of obvious practical value, to express x in terms of y. Since only y is
directly observable this is a primary need in making a linkage map.
Early attempts at this problem were made by Haldane, and a few years
ago a remarkably effectual formula was put forward by the Indian
mathematician, Kosambi, namely that

$$2y = \tanh(2x).$$

Kosambi's formula leads to an addition theorem for the recombination
fraction of the sum of two segments,

$$2y_{1+2} = \frac{2y_1 + 2y_2}{1 + 4y_1 y_2},$$

and with three genetic markers this may be tested by direct comparison
with the data. In such tests Kosambi's formula usually shows up very
well, yet it can scarcely be accurate in all cases, for it takes no account
of the position of the segment in relation to the centromere, a factor
which has long been known greatly to influence interference.
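
As an illustrative check, not part of the original lecture, the following Python sketch evaluates Kosambi's formula in its usual form, 2y = tanh(2x) with x measured in Morgans, and confirms numerically that the addition theorem follows from it. The function names are hypothetical and chosen only for this example.

```python
import math


def kosambi_recombination(x):
    """Recombination fraction y for map distance x (in Morgans): 2y = tanh(2x)."""
    return 0.5 * math.tanh(2.0 * x)


def kosambi_map_distance(y):
    """Inverse relation: map distance x (in Morgans) for a recombination fraction y < 0.5."""
    return 0.5 * math.atanh(2.0 * y)


def kosambi_add(y1, y2):
    """Addition theorem: recombination fraction across two adjacent segments."""
    return (y1 + y2) / (1.0 + 4.0 * y1 * y2)


if __name__ == "__main__":
    y1 = kosambi_recombination(0.10)
    y2 = kosambi_recombination(0.20)
    # Both numbers should agree: the sum of the segments is 0.30 Morgans.
    print(kosambi_add(y1, y2), kosambi_recombination(0.30))
```

The agreement is simply the addition formula for the hyperbolic tangent; under this formula y can approach but never exceed 50 per cent.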

In the second place, even if it were accurate in expressing a fixed
relationship between x and y, a formula of this kind could not supply
a complete theory. For work with three factors the relationship between
x and y is sufficient, but with four or more factors segregating
simultaneously it is easily seen to be insufficient. With four factors,
for example, there are eight pairs of complementary genotypes, and
seven constants are needed to specify their relative frequencies; yet an
addition formula, such as that given above, can only give the recombination
fractions between the six pairs of loci. A further parameter
is needed to specify the gametic series. In fact, only the complete
series of expressions for $s_1, s_2, s_3, \ldots$ in terms of some common
parameters can give a theoretically complete account of recombination.

2. MATHER'S SKETCH OF CROSSING OVER AS A RANDOM PROCESS

I should like to recall at this point some quite early proposals made
by K. Mather in which the formation of chiasmata in a four-strand
pair of chromosomes was regarded as a random process, starting from
the centromere, in such a way that the interval between the centromere
and the chiasma next to it was assigned a definite frequency distribution,
and the interval between the first chiasma and the second was
treated in the same way. It is evident that with proper specification
of these distributions, calculation should be able to provide the frequencies
of all definable configurations, specifying the incidence of
simultaneous sets of chiasmata in the same chromosome.

BIOMETRICS MARCH 1948

I have followed Mather's ideas in this matter with some variations.
First, I consider only the configurations provided by sets of chiasmata
on the same strand, for these are enough to specify the genetic situation.
Next, it appears that good agreement with the facts may be obtained
by introducing a metric which is not to be identified with map
distance, or with physical length, in terms of which interference is a
constant property at all parts of the strand. In this metric it appears
to be sufficient, at least as a good approximation, to treat the centromere
as if it were an obligatory chiasma. In terms of this metric
it is then theoretically possible to calculate the series $s_1, s_2, s_3, \ldots$
for any chosen points, and in particular the values of x and y, so as to
give full information about recombination for all regions of a given
map.

3. THE SEX CHROMOSOME OF THE HOUSE MOUSE 

Some years ago, when Mather was working with me at the Galton
Laboratory, we carried out, with the able assistance of Mrs. S. Holt, an
experiment which at the time must have seemed rather extravagant.
Although the laboratory rodents had long been the best studied species
among the mammals, yet sex linkage had not appeared among them.
In these circumstances, it seemed to us worth trying whether any of the
factors we had readily available were, as was possible, carried in the
pairing segment of the sex chromosome, and in consequence exhibited
partial sex linkage. In the event, none of the six factors we tested
(pied, agouti, brown, dilute, wavy (wv_1), and light head) appeared to be
linked with sex in a test with about twelve hundred mice. Yet before
our experiment was ready for publication, J. B. S. Haldane, working
in the same College, had made a search for partial sex linkage in Man
and had found pedigrees of several rare defects appearing to be so
inherited. The test for a similar phenomenon in mice was evidently
not so hopeless.

When in 1943 I moved to Cambridge I was able, through the generosity
of The Rockefeller Foundation and the kindness of Dr. Snell of
the Roscoe B. Jackson laboratories, to increase the variety of my mouse
stocks, and about two years ago Mrs. M. Wallace, then Miss Margaret
Wright, noticed a certain irregularity in the sex ratios of some of the
lines which we kept segregating for the linked factors wavy-two and
shaker-two. In consequence we decided to set up an experiment which
we hoped would be competent to discover what was really happening
at these two loci. The effect to be demonstrated was evidently not very
large, and there were reasons to fear, as there usually are with recessive
factors, and especially shakers, that the evidence would be disturbed
by differential viability. Consequently, the experiment had to
be carefully designed.

There are four possible types of male heterozygous in these two factors,
for the animal may have received from his father either wavy or
shaker or both or neither, with a complementary set of genes from the
mother. Of these four male types, that in which neither recessive
comes from the father and both from the mother, i.e.

    wv_2 sh_2 X / + + Y    (Tricoupling)

may be regarded as in coupling for all three factor pairs, while the
other three types differ from it respectively by interchanging sex (X,
Y), wavy (+, wv_2), and shaker (+, sh_2), giving therefore

    + + X / wv_2 sh_2 Y

    + sh_2 X / wv_2 + Y

    wv_2 + X / + sh_2 Y

Each of these male types, mated with homozygous wavy shaker does,

    wv_2 sh_2 X / wv_2 sh_2 X

produces eight genotypes, four male and four female, which fall into
four pairs of complementary genotypes


    Wavy shaker females and normal males    A
    Normal females and wavy shaker males    B
    Shaker females and wavy males           C
    Wavy females and shaker males           D


The pairs of genotypes B, C, D are derived from the pair A by the
same three interchanges of male with female, of normal with wavy, and
of normal with shaker.

In relation to any one of the types of heterozygous fathers these
four pairs of genotypes are produced by four modes of gamete formation,
of which one, in the case of three loci in the same chromosome, may
be identified as old combinations, while in the others recombination has
taken place in one segment or the other, or in both. Without knowing
the order of the loci we cannot tell which of these is which; but in all
cases the three modes of formation involving recombination will differ
from the first, in which no recombination has occurred, by the three
operative interchanges for sex, wavy and shaker.

Consequently the data obtained when all four male types are used 
will consist of sixteen frequencies, each of a pair of complementary 
genotypes arranged in a symmetrical four by four Latin square, as 
shown below: 


                             Type of gamete formation
Type of male              No change  Sex change  Wavy change  Shaker change

wv_2 sh_2 X / + + Y           A          B            C             D
+ + X / wv_2 sh_2 Y           B          A            D             C
+ sh_2 X / wv_2 + Y           C          D            A             B
wv_2 + X / + sh_2 Y           D          C            B             A
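
The symmetry of this four by four arrangement can be checked mechanically. The Python sketch below is an illustration only: it encodes each interchange as a triple of bits (sex, wavy, shaker), treats a genotype and its complement as the same class, and composes the interchange that defines the male type with the interchange that occurs in gamete formation. The encoding and all names are assumptions made for this example, not notation from the text.

```python
# Interchanges relative to the tricoupling male, as (sex, wavy, shaker) bits.
OPS = {
    "none": (0, 0, 0),
    "sex": (1, 0, 0),
    "wavy": (0, 1, 0),
    "shaker": (0, 0, 1),
}
# Class of complementary genotypes produced by each interchange acting on A.
LABELS = {"none": "A", "sex": "B", "wavy": "C", "shaker": "D"}


def canonical(bits):
    """Identify a bit pattern with its complement (a complementary genotype pair)."""
    complement = tuple(1 - b for b in bits)
    return min(bits, complement)


CLASS_OF = {canonical(bits): LABELS[name] for name, bits in OPS.items()}


def compose(male_type, gamete_mode):
    """Combined interchange of a male type (row) and a mode of gamete formation (column)."""
    combined = tuple(a ^ b for a, b in zip(OPS[male_type], OPS[gamete_mode]))
    return canonical(combined)


def latin_square():
    """Rows are male types, columns are modes of gamete formation."""
    names = list(OPS)
    return [[CLASS_OF[compose(row, col)] for col in names] for row in names]


if __name__ == "__main__":
    for row in latin_square():
        print(" ".join(row))
```

Because interchanging all three factors together returns the same complementary pair, a sex interchange followed by a wavy interchange is equivalent to a shaker interchange, which is why every class appears exactly once in each row and column.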


The experimental purpose of obtaining data in this form is to allow
of the estimation of the frequencies of the four modes of gamete formation,
free from disturbances introduced by unknown differences in the
viabilities of the four pairs of genotypes, and of arbitrary differences
in the numbers of offspring bred to the four types of male. This statistical
problem is intricate and has never been adequately discussed.
Fortunately, however, the particular data obtained with the factors
wavy-two and shaker-two in mice offer no difficulty, for there were in
our results no signs of differential viability, as appeared from the complete
homogeneity of the four distributions shown by the rows of the
table.

The relative frequencies of the four modes of gamete formation may
in such a case be estimated directly from the total frequencies in the
columns. Out of about 450 mice bred it was found that the first
column contained about 29%, the second 41%, and the remaining
columns about 15% each. I need not emphasize that such a result appeared
highly paradoxical. For, in the first place, if sex linkage were
not involved, we should expect equality in the first and second columns,
as we would also in the third and fourth; whereas the first two columns
differ consistently for all types of male, and in the aggregate by over
50 mice. Next, if sex-linkage were involved we should expect entries
either in the third or in the fourth column to be very rare double crossovers,
but these two classes are nearly equally frequent. Finally, with
incomplete sex-linkage we should expect the sex interchanges to be less
numerous than the cases of no recombination.







THEORY OF GENETIC RECOMBINATION 




In respect to recombination frequencies we have:

                                        More accurately

    Wavy-shaker    15 + 15 = 30%        31.1% ± 2.17
    Sex-shaker     41 + 15 = 56%        56.1% ± 2.33
    Sex-wavy       41 + 15 = 56%        56.7% ± 2.33

So that the paradoxical nature of our data may be restated by saying
that both the locus for wavy, and that for shaker, have significantly
more than 50% recombination with sex; moreover, these recombination
fractions are nearly equal, although the loci must be 30 to 35 map units
of distance apart.
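
The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustration using only the rounded column percentages quoted in the text; the binomial standard error and the sample size of 450 (the text says "about 450") are assumptions of the sketch, so the result agrees only roughly with the more accurate values in the table. All names are hypothetical.

```python
import math

# Approximate shares of the four columns (modes of gamete formation), from the text.
COLUMN_SHARE = {"none": 0.29, "sex": 0.41, "wavy": 0.15, "shaker": 0.15}


def pairwise_recombination(share):
    """Recombination between two factors: sum of the columns in which exactly
    one of the two is interchanged relative to the other."""
    return {
        "wavy-shaker": share["wavy"] + share["shaker"],
        "sex-shaker": share["sex"] + share["shaker"],
        "sex-wavy": share["sex"] + share["wavy"],
    }


def binomial_se(p, n):
    """Standard error of a proportion p estimated from n offspring (assumed binomial)."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    n_mice = 450  # assumption: the text only says "about 450"
    for pair, p in pairwise_recombination(COLUMN_SHARE).items():
        print(f"{pair}: {100 * p:.0f}% (s.e. roughly {100 * binomial_se(p, n_mice):.2f}%)")
```

The sex-change column enters both sex recombination fractions because, with complementary genotypes pooled, interchanging sex is the same as interchanging wavy and shaker together.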

4. TOPOGRAPHICAL INTERPRETATION OF THE DATA 

Figure 1. Frequency Distribution of Length of Intercept Between Successive Breaks

In the male mouse it is to be supposed that the inheritance of sex
takes place as if a sex determining locus were situated at the point of
separation of the pairing segment from the differential segment. For
purposes of calculation it is convenient to take this point as origin.
The occurrence over a considerable length of the pairing segment of
recombinations with sex exceeding 50 percent becomes more intelligible
if we consider that the complete absence of chiasmata in the differential
segment of these strands must mean an absence of interference from
this side; so that, on a metric chosen for uniformity in respect of interference,
this region will be favored by chiasmata. Moreover, calculation
taking any plausible form (see fig. 1) for the distribution
of the intercept length, such as assigning to it the frequency element


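$$\tfrac{1}{2}\pi \, \operatorname{sech}\!\left(\tfrac{1}{2}\pi u\right) \tanh\!\left(\tfrac{1}{2}\pi u\right) du$$


where u is the metrical value, and the coefficient $\tfrac{1}{2}\pi$ is introduced to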


make the metric conform to map distance in regions far from the arm 
ends, or from the centromere, shows that this expected effect is realized. 

If now numerical values of $s_1, s_2, s_3, \ldots$ are tabulated for various
distances from the end, the value of the sex recombination fraction, y,
and the map distance, x, may each be obtained in terms of u. Then,
plotting the recombination fraction y against the map distance x, it
appears that y rises at first in equality with x, but passes the 50 percent
value at a map distance of about 60 map units and rises to a long flat
maximum between 55% and 56% recombination.


Figure 2. Recombination Fraction in Terms of Map Distance



The genetical findings with wavy and shaker would therefore be 
predicted by theory if these two loci were situated about 30 units apart 
in the neighborhood of the maximum recombination fraction and 
probably on either side of it. 

In a more detailed study George Owen has shown how the effects of
finite lengths of the chromosome arms may be taken into account; from
his work it appears also that recombination fractions exceeding 50 per
cent are characteristic of a wide class of intercept distribution functions,
provided interference is sufficiently pronounced. He has found
models in which explicit analytic functions may be used to replace the
somewhat cumbrous numerical integration I had employed.

As I mentioned in my opening remarks, very much detailed work
remains to be done, although, so far as we have explored the question,
the genetical data have shown themselves entirely conformable with the
numerical details of the theory. The exact degree of confirmation obtainable
from the available data for Drosophila will, however, not be
clear until a more thorough study has been made of the relationship
of coincidence values, appropriately estimated, to the chromosome
regions in which they are found.


DISCUSSION 

D. G. Catcheside. The occurrence of recombination in excess of
50 per cent requires that there should be chromatid interference favoring
a frequency of four-strand doubles in excess of that expected in the
absence of chromatid interference.

When two successive chiasmata occur in an arm of a pair of chromosomes,
the relationships between the four chromatids may be such
that two strands (complementary), or three strands (diagonal) or four
strands (reciprocal) take part in the crossing-over at the two chiasmata.
If the participation of a given strand in one chiasma does not affect the
chance of its participation in the other chiasma in the arm, the strand
relationships should occur in double crossing-over with the frequencies
of a quarter complementary, a half diagonal and a quarter reciprocal.
This absence of chromatid interference is found in Drosophila melanogaster
and with such absence the amount of recombination between
linked genes cannot rise above 50 per cent.
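
The quarter, half, quarter proportions can be reproduced by direct enumeration. The Python sketch below is an illustration under stated assumptions, not Catcheside's own calculation: each of two exchanges between the loci is taken to involve one strand carrying maternal material and one carrying paternal material at that level, and all choices are taken as equally likely and independent (no chromatid interference). Strand names and function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Origin of the left-hand locus on each of the four chromatids of the tetrad.
LEFT = {"M1": "M", "M2": "M", "P1": "P", "P2": "P"}


def double_exchange_outcomes():
    """Enumerate the 16 equally likely pairs of exchanges between two loci.

    Returns a list of (number of strands involved, fraction of recombinant chromatids).
    """
    outcomes = []
    for m1, p1 in product(["M1", "M2"], ["P1", "P2"]):
        # Material carried to the right of the first exchange by each strand.
        right = dict(LEFT)
        right[m1], right[p1] = right[p1], right[m1]
        maternal_now = [s for s in right if right[s] == "M"]
        paternal_now = [s for s in right if right[s] == "P"]
        for a, b in product(maternal_now, paternal_now):
            final = dict(right)
            final[a], final[b] = final[b], final[a]
            strands = len({m1, p1, a, b})
            recombinant = Fraction(sum(LEFT[s] != final[s] for s in LEFT), 4)
            outcomes.append((strands, recombinant))
    return outcomes


def summarize():
    """Counts of two-, three- and four-strand doubles, and the mean recombination."""
    outcomes = double_exchange_outcomes()
    strand_counts = Counter(k for k, _ in outcomes)
    mean_recombination = sum(r for _, r in outcomes) / len(outcomes)
    return strand_counts, mean_recombination


if __name__ == "__main__":
    print(summarize())
```

Two-strand doubles yield no recombinants, three-strand doubles yield half, and four-strand doubles yield all four chromatids recombinant, so the average over the three classes in these proportions is exactly one half.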

Chromatid interference may be of two types; either (a) the participation
of a chromatid in one chiasma reduces its chance of participation
in a second chiasma (positive interference) or (b) participation
of a chromatid in one chiasma increases its chance of participation in
a second chiasma (negative interference). The latter, in which the
upper limit of recombination would be less than 50 per cent, has been
observed in Neurospora crassa, where 2-strand doubles are in excess
and 4-strand doubles are deficient. The former type, which could in
some circumstances lead to an amount of recombination between two
loci in excess of 50 per cent, has not so far been observed. It may be
that crossing-over in the heteromorphic pair of sex chromosomes in the
male mouse is of this type, but it must not be assumed that positive
chromatid interference is of general occurrence.

While it is true that most genetical experiments have shown an
absence of chiasma interference across the centromere, recent work of
Callan and Montalenti (Journal of Genetics 48, 119, 1947) has confirmed
the occurrence in Culex pipiens of positive chiasma interference
across the centromere as previously recorded by Patau. Thus the
centromere cannot always be taken as a null point.


Hilda Geiringer. Professor Fisher's new and extremely interesting
theory of recombination suggests the following question: The complete
linkage of m factors is characterized by $M = 2^{m-1} - 1$ parameters.
In fact, denote the constitution of an organism by
$(x_1, x_2, \ldots, x_m;\ y_1, y_2, \ldots, y_m)$ where the maternal (paternal)
heritage with respect to the i-th factor is denoted by $x_i$ or $y_i$
respectively. A gamete may then be described, for example, by
$x_1, x_2, y_3, y_4, y_5, \ldots, x_m$; there are $2^m$ gametes
of this sort and there exists a probability distribution corresponding to
these $2^m$ possibilities. Admitting obvious symmetry relations this number,
$2^m$, will be divided by two and we thus have $2^{m-1}$ probability values
with sum equal to one, hence M parameters. This "linkage distribution"
(l.d.) is completely equivalent to the cross-over distribution, or
"recombination distribution" (r.d.) which (in the "linear theory")
specifies the probabilities that in each of the (m - 1) intervals between
the loci of linked factors recombination takes place or does not take
place.

On the other hand this r.d. is, in general, not equivalent to the
$N = m(m-1)/2$ recombination values. In fact $M > N$ for $m \geq 4$.
From the r.d. (or from the l.d.) the recombination values follow, but
not vice versa. It is, however, an aim to be able (by means of observation
and theory) to reduce the number M of independent parameters
and to express the M values by means of the N recombination values
or possibly by part of them, e.g. by the (m - 1) recombination values
between adjacent factors. May I ask Professor Fisher's opinion regarding
this question, in the light of his new theory?
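
The two counts compared here are easy to tabulate. The short Python sketch below is purely illustrative (the function names are hypothetical); it evaluates M = 2^(m-1) - 1 and N = m(m-1)/2 for small m, showing that the two agree for two and three factors and that M exceeds N from four factors onward, in line with the seven constants against six recombination fractions mentioned in the lecture for four factors.

```python
def linkage_parameters(m):
    """M: independent parameters of the linkage distribution for m factors."""
    return 2 ** (m - 1) - 1


def recombination_values(m):
    """N: number of pairwise recombination values among m factors."""
    return m * (m - 1) // 2


if __name__ == "__main__":
    for m in range(2, 9):
        print(m, linkage_parameters(m), recombination_values(m))
```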
Alexander Weinstein. Recombination exceeding fifty per cent.
The recombination frequency between two loci depends on (1) the
number of chromatids that cross over at a level, (2) which chromatids
these are (whether homologous or sister strands), (3) the number of
levels at which crossing over occurs, and (4) the relation of exchanges
at different levels to one another (whether regressive, progressive, or
digressive).

If at any level only two chromatids of a tetrad cross over and these
are always homologous and not sister strands, then, on certain assumptions,
in crossover tetrads with n exchanges between two loci, the proportion
of chromatids showing recombination between these loci is $\tfrac{1}{2}$
when n is odd and $\tfrac{1}{2} - \tfrac{1}{2}(F - E)^{n/2}$ when n is even; where F is the chance
that an exchange is regressive and E the chance that it is digressive
with respect to an adjacent exchange.¹ These formulas apply not to
chromatids from all tetrads taken together, but to chromatids from
tetrads of a specified rank n.

If $F = E$, then in a group of crossover tetrads of rank n, the recombination
frequency of emerging chromatids remains $\tfrac{1}{2}$ as n increases.
If F exceeds E, the frequency will oscillate between $\tfrac{1}{2}$ and a lesser
value which approaches $\tfrac{1}{2}$ as n increases indefinitely. If F is less than
E, the recombination frequency oscillates between values greater and
values less than $\tfrac{1}{2}$, and these values approach $\tfrac{1}{2}$ as n increases indefinitely.
That is, the recombination frequency in chromatids derived
from tetrads of rank n exceeds $\tfrac{1}{2}$ if F is less than E and $n/2$ is odd; and
if such tetrads are sufficiently numerous, the recombination frequency
in chromatids from all tetrads taken together may exceed 50 per cent.
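
The behaviour described in this paragraph can be explored numerically. The Python sketch below is an illustration that assumes the formula in the form one half when n is odd and 1/2 - 1/2 (F - E)^(n/2) when n is even, with F and E constant along the chromosome; the function name is hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def rank_n_recombination(n, F, E):
    """Proportion of recombinant chromatids from crossover tetrads of rank n.

    F is the chance that an exchange is regressive, E that it is digressive,
    relative to an adjacent exchange (assumed constant along the chromosome).
    """
    if n < 1:
        raise ValueError("n must be a positive number of exchanges")
    F, E = Fraction(F), Fraction(E)
    if F < 0 or E < 0 or F + E > 1:
        raise ValueError("F and E must be probabilities with F + E <= 1")
    half = Fraction(1, 2)
    if n % 2 == 1:
        return half
    return half - half * (F - E) ** (n // 2)


if __name__ == "__main__":
    # An example with F less than E: the value lies above one half when n/2 is odd
    # and below one half when n/2 is even.
    for n in range(1, 9):
        print(n, rank_n_recombination(n, Fraction(1, 10), Fraction(4, 10)))
```

With F equal to E, for instance both one quarter as in the absence of chromatid interference, the function returns one half for every rank.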

These results assume that F and E are constant throughout the
chromosome. If either or both vary according to the regions involved,
the recombination frequency in chromatids derived from tetrads with
exchanges in n specified regions between two loci remains $\tfrac{1}{2}$ when n is
odd; and when n is even, the recombination frequency becomes
$\tfrac{1}{2} - \tfrac{1}{2}(F_2 - E_2)(F_4 - E_4)(F_6 - E_6) \cdots (F_n - E_n)$, where the subscript
indicates the second of the two adjacent regions involved; for example,
$F_4$ is the chance that an exchange in region 4 is regressive with respect
to an exchange in region 3. The recombination frequency in chromatids
derived from a crossover tetrad of rank n would exceed $\tfrac{1}{2}$ if F
is less than E in an odd number of factors, provided that in the other
factors F exceeds E. If F equals E in one or more factors, the recombination
frequency is $\tfrac{1}{2}$.

¹ Weinstein, Alexander, "Mathematical study of multiple-strand crossing over and
coincidence in the chromosomes of Drosophila," American Philosophical Society Year
Book 1937, pp. 227-228, 1938.

The recombination frequency might also exceed 50 per cent if an
exchange between two chromatids of a tetrad were accompanied by an
exchange between the two others at the same level. This would yield 100
per cent recombination in chromatids derived from tetrads with crossing
over at one level: the result would be the same as if the two exchanges
were at different levels and digressive with respect to each
other. If two exchanges at the same level are treated as mutually
digressive, the formulas already given can be applied, whatever the
number of levels at which there is crossing over; provided that n is
taken as the number of exchanges, which is now no longer identical
with the number of levels at which exchanges occur.

The relative frequency of regressive, progressive, and digressive
crossing over also affects other genetic functions, including coincidence
of various types. In attached X-chromosomes, the frequency of homozygosis
in individuals derived from tetrads of rank 2 or higher depends
on the frequency of progressive exchanges, G, which equals $1 - F - E$.
Thus in suitable material the values of F and E calculated from the
recombination frequencies can be subjected to additional tests.

(Such a notation was presented orally, but is omitted from the 
written proceedings subject to publication elsewhere.) 

J. Lederberg. Professor Fisher has given us a strong justification 
of what he terms a genetic as opposed to a cytogenetic approach to the 
problem of interference in crossing-over. We shall, however, all be 
interested to see in the further development of this analysis the extent 
to which his genetic formulation is consistent with, or even relevant to, 
cytogenetic data on interference in the four-stranded bivalent in which 
the physical phenomenon of crossing over is believed to occur. 

In bridging this gap, the usual procedure is the construction of 
many tetrad diagrams in which the effects on the gametic output of 
combinations of crossovers involving various strands at different levels 
are worked out geometrically. The geometry is, however, sufficiently 
simple to be amenable to convenient notation, by the use of which the 
effects of crossing over can be elucidated algebraically. An algebra is 
most useful perhaps in respect to the formulation of theories in terms 
of restraints on the values of its terms, which may be more easily 
manipulated than restraints geometrically expressed. 
R. A. Fisher. In reply to Mr. Catcheside, I am surprised and somewhat
taken aback that he should be of the opinion that chromatid interference
is absent in Drosophila melanogaster. Perhaps he only means
that the interference observed could conceivably be explained without
specifically chromatid interference; but this seems to be quite another
matter. It is at least obvious that the Drosophila data accord excellently
with the theory here under discussion.

In reply to Dr. Geiringer: The new theory does supply formulae,
in some cases evaluable easily, and in others with more difficulty,
which give the frequencies of all possible recombinations to be
found in gametes from a multiple heterozygote. These are given in
terms of the metric-values of the different segregating loci relative to
the centromere, of the arm lengths of the two parts of the chromosome,
and, possibly, of an additional parameter available for specifying the
intensity of interference.

In respect to the points raised by Dr. Weinstein and others in
the discussion, I should have made it clear that my theory does not
depend on the whole complexity of the configurations possible in the
four-strand stage, but only on the frequency distribution of the different
configurations possible to a single strand received by a gamete.
This simpler distribution is sufficient for the purely genetic purpose of
predicting the frequencies of the different genotypes derived from a
multiple heterozygous parent.



THE ALLOMETRY RELATION

ITS STATISTICAL AND BIOLOGICAL SIGNIFICANCE*

Georges Teissier 
Station Biologique de Roscoff 

The chapter in the study of growth in which recent progress has been
most remarkable is unquestionably the one dealing with changes in the
proportions of the different organs or, in other words, with relative
growth. Research on this question, very abundant over the last twenty
years or so, has furnished material for several reviews of unequal
length. 1

The subject is now so vast and so complex that, in a limited
exposition, one must be content either to treat only its broad outlines,
leaving aside not only many important points but also most of the
difficulties that still remain in it, or to concentrate on a much
narrower question and try to go into it thoroughly. I have chosen the
latter course, and what follows deals with a single problem which, in
truth, lies at the very centre of the subject. We shall seek the exact
meaning of the allometry relation, which for more than twenty years has
been the indispensable instrument of every study of relative growth.

The fundamental fact in this field is the existence of a type of
relation which, suitably fitted, describes the most diverse cases of
relative growth accurately. In all the animals, belonging to all the
major phyla, studied so far, the relations between, on the one hand, the
dimensions or weight of organs differing widely in nature, structure or
function and, on the other hand, the dimensions or weight of the whole
body, or of an organ taken as reference, can be expressed by equations
of the form:

$$y = b x^{a}$$


* Lecture delivered September, 1947, before the First International Biometric
Conference.

1 G. Teissier, Travaux Station Biologique de Roscoff, 1931, 9; J. S. Huxley, Problems
of Relative Growth, London, 1932; G. Teissier, Act. Sc. Hermann 95, 1934; Annales
Physiologie et Physicochimie Biologique, 12, 1936, and Act. Sc. Hermann, 1936; E. C. R.
Reeve and J. S. Huxley; P. B. Medawar; O. W. Richards and A. J. Kavanagh, Essays on
Growth and Form, Oxford, 1945.
where b and a are constants. In logarithmic coordinates these arcs
of power curves become straight lines of slope a:

$$\log y = \log b + a \log x$$

When the growth of an organ is faster or slower than that of the rest
of the body, or of the reference organ, and the preceding relation
holds, we say that there is allometry. If a > 1, the allometry is called
positive (majorante); if a < 1, it is negative (minorante). When a = 1,
the growth of the organ and that of the body run parallel: there is
isometry.

Of the two parameters that define the allometry relation, the more
important, a, the slope of the straight line representing the phenomenon
in logarithmic coordinates, will be called the current equilibrium
constant (constante actuelle d'équilibre); b, which represents the size
of the organ y when x = 1, has the same meaning as the indices used by
anthropologists: it will be called the origin index (indice origine).
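
As a hedged illustration of how the two parameters are obtained in practice, the Python sketch below fits the straight line log y = log b + a log x to paired measurements. Ordinary least squares on the logarithms is an assumption of this sketch and is not stated to be the author's fitting method; the function names and the tolerance used to recognize isometry are likewise hypothetical.

```python
import math


def fit_allometry(xs, ys):
    """Estimate (a, b) in y = b * x**a by least squares on log y against log x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired measurements")
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    a = sxy / sxx                          # slope of the line in log coordinates
    b = math.exp(mean_y - a * mean_x)      # value of y when x = 1
    return a, b


def classify(a, tol=1e-9):
    """Positive allometry if a > 1, negative if a < 1, isometry if a = 1."""
    if abs(a - 1.0) <= tol:
        return "isometry"
    return "positive" if a > 1.0 else "negative"


if __name__ == "__main__":
    # Synthetic data that follow y = 2 * x**1.5 exactly, for illustration only.
    xs = [1.0, 2.0, 4.0, 8.0]
    ys = [2.0 * x ** 1.5 for x in xs]
    a, b = fit_allometry(xs, ys)
    print(a, b, classify(a))
```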

Before beginning a thorough examination of this relation it is worth
recalling, to justify the time we shall devote to it, the importance of
the part it now plays in the study of relative growth and in general
biometry.

In the first place, we must stress that it generally renders the course
of average growth with the greatest fidelity. This could be shown by
several hundred equally convincing examples. Statistical study of the
cases for which sufficiently numerous measurements are available shows,
moreover, that the deviations on either side of this average law can be
regarded as fortuitous. The relation also allows adult animals,
belonging either to the same species or to different species, to be
compared with equal confidence. In this capacity it has become a very
effective instrument in the study of phylogenetic series and in the
search for affinities between races of one species or species of one
genus.

Certain organs or parts of the body, whose growth cannot be represented
by a single power curve, follow several allometry relations in
succession. On logarithmic plots the straight lines may join at an
angular point, or be separated by a discontinuity of greater or lesser
extent, which corresponds to a stage at which the weight (or length) of
the organ is no longer a function of the weight (or length) of the body.
A relative growth curve can thus show two kinds of irregularity, and in
both cases it breaks down into regular, exactly defined segments.
The angular points and discontinuities that mark the growth curves
of the different organs of one animal are not distributed at random; on
the contrary, they appear simultaneously in various organs. The growth
of an animal thus divides naturally into distinct phases. During each
phase the various organs and biochemical constituents grow, relative to
one another, according to simple allometry relations, while chemical
composition, structure and form vary continuously. Successive phases
are separated by critical stages of fairly short duration during which,
without any appreciable increase in weight or size, the rules that had
governed growth until then change abruptly. These critical stages show
on the various relative growth curves as angular points or
discontinuities. Biochemically, they correspond to more or less profound
changes in the mutual equilibrium of the constituents of the organism.
Physiologically, they often correspond to important modifications in
the interplay of humoral correlations.

This conception of relative growth, which I developed some time ago,
still seems to me to correspond fairly closely to reality, although on
certain points, notably the role of the critical stages, specialists are
far from agreed. There is no question of defending it here; but whether
it is right or wrong, its final success or failure will depend in the
end on how far the allometry relation is valid. None of the
physiological interpretations, all rather similar, suggested by Huxley,
by Robb, or by myself has so far won general assent. A more satisfactory
one may perhaps be found, but it will still remain to explain why the
same formula fits so many disparate cases, and to understand its
general significance.

Before approaching this problem, a preliminary remark is indispensable.
The essential reason for the success of the notion of allometry is
that it does not involve the age of the individuals whose
characteristics it compares. It is thanks to this circumstance that very
precise studies have been possible on the growth of animals that are
difficult, or even impossible, to rear in the laboratory.

This feature is obviously not tied to the particular form of the
allometry relation: any other relation $\varphi(x, y) = 0$ between two
quantities measured on the same animal would "short-circuit time" just
as well. It does, however, presuppose a condition that is rarely stated
explicitly, no doubt because it is too obvious, but which is necessary
for the success of any study of relative growth. In the organism
studied, the form, structure or chemical composition of an individual
placed in given conditions must depend only on its dimensions and not
on its age.

That such a condition can be fulfilled during growth is in no way a logical necessity. It appears as a practical rule that holds rigorously for many species, and approximately for many others, during the greater part of their existence, and which, in them, fails if at all only at certain critical stages of development; this naturally does not exclude the possibility that some species respond by a change of form to certain modifications of their environment. It is still far too early to attempt a list of the species belonging to this category of beings of stable form, but the great group of the Arthropods must nevertheless be cited in the first rank among them. Since, moreover, these animals lend themselves to multiple measurements that are generally much more precise than those that can be made on representatives of the other phyla, they constitute, along with certain Vertebrates, a material of choice for the study of the laws of relative growth. It seems, on the other hand, judging at least by the researches of Japanese authors, that the Lamellibranchs lend themselves poorly to this kind of investigation, their form depending rather largely, at equal size, not only on environmental conditions but perhaps also directly on age.

We shall leave aside the animals belonging to this second category and deal only with those in which, during at least part of growth, two batches of animals of the same size and different age cannot be told apart, and in which, allowing for specific variability, form is a function of size.

There is another circumstance, entirely different in principle from the preceding one, in which it is legitimate to compare the dimensions x and y of two organs while disregarding time: that in which the comparison bears on animals that have ceased to grow and in which, for a time or permanently, form and size are as it were fixed. This is so for adult Mammals and for the imagos of Insects, and it is well known that some of the first applications of allometry relations were made on animals belonging to these groups and fulfilling these conditions. Those concerning Mammals generally try to describe the allometry relations that may exist between the structures of species that are closely related but very different in size, while those concerning Insects attempt to analyse the differences of form that go with differences of dimensions within a single species. Since then, allometry of size and allometry of growth have been studied, by the same methods, by the same authors, without any of them having succeeded in fully justifying the use of one and the same formula to express relations as dissimilar as those that may link a young animal and the corresponding adult, two adults of the same species, or two adults of different species. Still less has anyone succeeded in explaining why the allometry relation can be a powerful instrument of analysis in the study of intraspecific and interspecific variations in the chemical composition of organisms.

The present exposition will be devoted to the search for a general interpretation of the allometry relation.

I 

Several physiological schemes, fairly close to one another, have been proposed to interpret allometry of growth. Although none of them has so far won the general assent of biologists, one may hope that a similar but more perfect scheme will some day be recognized as satisfactory. It is certain, on the other hand, that a mechanistic scheme of this type can apply only to cases where the comparison bears on the successive stages of development of a single individual, and to those where the individuals compared may legitimately be regarded as furnishing the image of the different stages of one and the same evolution. It would lose all meaning if one tried to apply it to cases where the comparison bears, in principle as in fact, on different individuals, and where the allometry relation condenses into one formula an infinity of possibilities of which each individual can, in the most favourable circumstances, realize only one. In 1937 I proposed a statistical interpretation of this allometry of size, and I thought at that time that it would be unreasonable to try to transpose it to the case of allometry of growth. I now believe, however, in the possibility of an interpretation of the phenomena of allometry that would embrace not only all the morphological facts but also the biochemical facts, for which no interpretation other than a purely formal one has so far been found.

The starting point of this unitary conception lies in the statistical interpretation of allometry of size, suitably made precise.

Consider, then, a homogeneous population of individuals that have reached their final size, on each of which a certain number of quantities x, y, z, ... can be measured with precision. Let us fix our attention on two of these variates, x and y. The result of the measurements is a cloud of points (x, y) whose form is the better defined the greater the number of individuals measured. We shall confine ourselves to the cases, fortunately numerous in practice, where the correlation between the two quantities considered is strong; these are in any case the only ones for which the notion of allometry can be of real interest. The cloud representing the population of measurements is then very elongated. Its form must now be made precise. On this point, which is in fact crucial, it is unfortunately not possible to reach absolute certainty. It turns out that in many cases the cloud may be regarded as sensibly normal, in the sense that its points are distributed approximately according to a Laplace-Gauss law in two variables and that it is elongated along a straight line. But it also turns out that, very often, the same measurements expressed in the system log x, log y again give a sensibly normal cloud of points, elongated along a straight line. As an example of this very general fact, I shall cite adult male Maia squinado, in which the coefficient of correlation between the length of the propodite of the claw and the length of the cephalothorax is 0.957, and the coefficient of correlation between the logarithms of these quantities is 0.965, the logarithmic cloud being perhaps a little better aligned along a straight line than the arithmetic cloud, and being entitled, just as much as the latter, to be regarded as normal. The ambiguity of this result is obviously due only to the insufficient number of individuals measured; by increasing the size of the sample of the population studied, one should be able to make the choice that proves impossible on a batch of 300 animals, unless, in so doing, one comes upon a result incompatible with both of the hypotheses envisaged. In practice, with the populations of measurements available, this uncertainty is very frequent.

It can certainly happen that the cloud of points is more elongated and more rectilinear in arithmetic coordinates than in logarithmic coordinates, and that the law of distribution is clearly of the Laplace-Gauss type. It is certain, on the other hand, that in many cases, notably when one of the measurements is a weight and the other a length, the cloud is more nearly normal when constructed from the logarithms of the measurements than when constructed from the measurements themselves, the distribution then being of the Galton-MacAlister type. But it also happens, and I have observed this notably in studying the relation between the weight of the elytra and that of the mandibles in Lucanus cervus, that the cloud is normal in neither system and is elongated along a straight line only in its middle part.

Leaving aside these complex distributions, still very imperfectly known, the situation appears to be as follows: in the greatest number of cases, among those that have been studied most completely and most rigorously, two laws seem at present to be usable with equal legitimacy. To justify by rigorous statistical criteria the choice one is led to make between them, the amount of material studied would have to be increased beyond measure. Even then it is not certain that one would not discover that a third law, more complex than the other two and in some sense intermediate between them, fits the observed facts better. It appears, in these conditions, that one is at present entitled to express the stochastic relation between the two variables by a logarithmic regression just as well as by a linear regression. This finding legitimizes, to a certain extent, the use of the allometry relation, without however justifying the place this relation has taken in biometry at the expense of the linear relation, which can claim, just as well, to represent the observed facts. Although the fact that, in a certain number of well-established cases, the logarithmic regression gives a better fit than the linear regression argues in favour of the allometry relation, this general preference still has to be justified by biological arguments. We shall find them by looking more closely at the notion of variability, as it is given by the observation of nature on the one hand and by the statistical description of the facts on the other.

We recall first of all that, in the eyes of the biologist, relative variation is more significant than absolute variation. Two organs of different size in the same animal are regarded as equally variable not if their dimensions show equal fluctuations, but if these fluctuations have the same relative importance; variability is traditionally assessed not in metric units but in percentages. It seems natural, in these conditions, to say that two organs are equally variable if their coefficients of variability are equal, and to measure the magnitude of this variability by that parameter. This is a definition whose soundness it is naturally impossible to prove, but which appears to be in perfect agreement with the idea that biologists have formed of variability. It assumes that, in organs that are equally variable and of different mean dimensions, the standard deviation is proportional to size. This is in perfect agreement with the empirical finding that the coefficients of variability of homologous organs of animals of very different size show no systematic variation with size, and that the objects the biologist is inclined to regard as equally variable always have sensibly the same coefficient of variability.

This definition of variability has several interesting consequences. The first appears as soon as one examines the reasons that can be given to explain why the quantitative variations of a living being can be expressed by a normal, or sensibly normal, law. These reasons are modelled on those used to explain the validity of the same law in other fields, and more particularly in the theory of errors, and there is no need to dwell on them at length. If adult animals of the same species and the same sex are not identical, it is, on the one hand, because their genetic constitution presents unsuspected differences and, on the other, because they have never lived in rigorously identical conditions. It may happen that the differences in size are due, for the most part, to one or to a small number of predominant factors, certain genes for example; but one must admit that in general they result from the interference of innumerable actions, each having a very small effect, that occurred at one stage or another of development under the influence of the conditions of the moment. If we admit that, for the most part at least, these causes of variation are independent and that their effects are additive, a hypothesis we shall have to examine more closely in a moment, it is possible, as is well known, to show that the final distribution will be the closer to a normal distribution the greater the number of causes of variation and the smaller their individual effect. The demonstration extends to the case of distributions in two variables, the correlation being explained, and this is very important for us, by the fact that a large number of the causes that make one of the quantities vary act equally on the other and in the same direction. To explain why certain distributions depart clearly from the normal, it suffices to admit that certain preponderant factors of development do not have a normal distribution in the population; that, for example, part of the animals was subjected at a certain moment to conditions very different from those in which the others remained, or again that part of the individuals carries a gene modifying the dimensions of the whole body or those of one of its parts. 2

Consider now two races of unequal mean size, and suppose them so close to one another in form and structure that the only differences apparently existing between them are metric; suppose them, moreover, equally variable in the precise sense of the term, the standard deviation being proportional to the mean size. From what precedes, for this to be so, the number of perturbing causes, or the magnitude of their mean effect, or again both the number and the magnitude of the elementary perturbations, must increase with size. More precisely, in the two extreme cases, either the perturbations keep on average the same magnitude and their number is sensibly proportional to the square of the size (the standard deviation of a sum of independent effects growing as the square root of their number), or, this number remaining of the same order, the mean magnitude of each of them is sensibly proportional to the size. It is clear that this second hypothesis is much more reasonable than the first, and that one must admit that, when the distribution is normal, the causes that tend to move an individual away from the mean have effects sensibly proportional to that mean. This hypothesis, dictated by theoretical considerations, also has a solid biological foundation: it means that the factors modifying growth have multiplicative and not additive effects, that they act not by increasing or decreasing size by a certain fraction of a centimetre or of a millimetre, but by increasing or decreasing it by a certain fraction of that size. This fact appears more and more firmly established for the genetic factors of growth and, as regards external factors, all the physiological interpretations of the laws of overall growth postulate it more or less expressly.

2 In this case one could expect to find Pearson or Gram-Charlier distributions.

These are so many reasons for accepting this hypothesis, which it then becomes very interesting to push further. If the effect of a factor appears multiplicative when two races of unequal size of the same species are compared, there is no evident reason why it should not also be so within a single race when the largest and the smallest individuals are compared; one does not understand why the effect of a factor on a given individual should be proportional to the mean size of the species and not to its own size. One is thus led, by a quite natural generalization, to interpret the variability of an adult population by the interference of innumerable elementary actions, each of which increases or decreases by a very small percentage the size of the individual on which it acts. To admit that this is so amounts to saying that the distribution of the logarithms of the sizes must be normal, or that the distribution of the sizes themselves is of the Galton-MacAlister type. Since the preceding reasoning extends immediately to distributions in several variables, the logarithmic regression appears better founded biologically than the linear regression.
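The argument above can be explored numerically. What follows is a minimal sketch and not part of the original paper: it assumes that each elementary action multiplies the current size by (1 + e), with e drawn uniformly between -1% and +1%, and that 200 such actions act on each individual. These figures and all function names are illustrative assumptions. Under them, two simulated races of different mean size should show nearly the same coefficient of variability, and the logarithms of the sizes should be nearly symmetric about their mean.

```python
import math
import random
import statistics


def simulate_sizes(mean_size, n_individuals=2000, n_factors=200,
                   pct=0.01, seed=0):
    """Build each individual's size from many small multiplicative actions.

    Every action multiplies the current size by (1 + e), with e uniform in
    [-pct, +pct]. All numerical settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_individuals):
        size = mean_size
        for _ in range(n_factors):
            size *= 1.0 + rng.uniform(-pct, pct)
        sizes.append(size)
    return sizes


def coefficient_of_variability(values):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * statistics.pstdev(values) / statistics.fmean(values)


def skewness(values):
    """Third standardized moment; near 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Two hypothetical races differing only in mean size.
small_race = simulate_sizes(mean_size=50.0, seed=1)
large_race = simulate_sizes(mean_size=200.0, seed=2)

cv_small = coefficient_of_variability(small_race)
cv_large = coefficient_of_variability(large_race)
log_skew = skewness([math.log(v) for v in large_race])

print(f"V (small race) = {cv_small:.2f}")
print(f"V (large race) = {cv_large:.2f}")
print(f"skewness of log sizes = {log_skew:.3f}")
```

Replacing the multiplication by an addition of a fixed absolute increment would instead give the same standard deviation in both races, and therefore a coefficient of variability four times smaller in the larger one.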
This argument in favour of the allometry relation can be presented in another way, which has the advantage of involving fewer hypotheses. It is known that a characteristic of normal distributions in two variables is that they are homoscedastic, that is, that each variable has a conditional standard deviation independent of the particular value of the other. In other words, if the individuals are classed by increasing values of x, the standard deviation of y will be the same for the largest individuals as for the smallest and, likewise, if the classification is made with respect to y, the standard deviation of x will remain constant throughout the distribution. If the normal distribution is that of the direct measurements, it follows that the coefficient of variability of the large individuals will be lower than that of the small ones. If the normal distribution is that of the logarithms of the lengths, the coefficient of variability will on the contrary be practically independent of size. This remark does not supply a test usable in each particular case, since the individuals of extreme size are generally too few for a coefficient of variability to be determined on them with any confidence, but it is none the less of real interest. It has never been observed, in fact, that in each species the largest animals are less variable than the smallest; everything suggests, on the contrary, that this is not so, and this argument in favour of the allometry relation deserves to be retained. 3
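The contrast between the two hypotheses can be sketched numerically. The code below is an illustration under stated assumptions, not the author's calculation: it uses the standard formulas for the conditional mean and conditional standard deviation of a bivariate normal distribution, and the standard expression for the coefficient of variation of a log-normal variable. The mean of 180 mm, the standard deviation of 20 mm for the reference dimension, V = 21 and r = 0.957 are taken from the text and its footnote 3; the organ mean of 100 and the logarithmic standard deviation of 0.2 are hypothetical placeholders.

```python
import math


def cv_if_measurements_normal(x, mean_x, sd_x, mean_y, sd_y, r):
    """Coefficient of variability (%) of y among individuals of size x,
    assuming (x, y) themselves follow a bivariate normal law."""
    cond_mean = mean_y + r * (sd_y / sd_x) * (x - mean_x)
    cond_sd = sd_y * math.sqrt(1.0 - r * r)   # same for every x
    return 100.0 * cond_sd / cond_mean


def cv_if_logarithms_normal(sd_log_y, r_log):
    """Coefficient of variability (%) of y at fixed x, assuming the natural
    logarithms follow a bivariate normal law. No x appears: it is constant."""
    cond_sd_log = sd_log_y * math.sqrt(1.0 - r_log * r_log)
    return 100.0 * math.sqrt(math.exp(cond_sd_log ** 2) - 1.0)


# mean_y = 100 is hypothetical; sd_y = 21 then corresponds to V = 21.
params = dict(mean_x=180.0, sd_x=20.0, mean_y=100.0, sd_y=21.0, r=0.957)

cv_small = cv_if_measurements_normal(140.0, **params)
cv_mean = cv_if_measurements_normal(180.0, **params)
cv_large = cv_if_measurements_normal(220.0, **params)
cv_log = cv_if_logarithms_normal(sd_log_y=0.2, r_log=0.965)

for size, cv in ((140, cv_small), (180, cv_mean), (220, cv_large)):
    print(f"measurements normal, size {size} mm: V = {cv:.1f}")
print(f"logarithms normal, any size: V = {cv_log:.1f}")
```

Under the first hypothesis the coefficient falls steadily as size increases, because a constant conditional standard deviation is divided by a growing conditional mean; under the second it does not depend on size at all.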

Taking all the foregoing into account, we shall opt for the allometry relation against the linear relation, knowing full well, however, that this reasonable choice is not obligatory, since in matters of stochastic relationship there always remains room for a wide latitude in the choice of analytical forms. In certain cases we may even use this latitude to employ alternately, according to the aim pursued, one or the other of the two competing relations. 4


3 One ought to find, for example, that the coefficient of variability of the propodite of the claw of male Maia squinado is 9.8 for animals measuring 140 mm and 4.1 for those reaching 220 mm, with a value close to 5.8 for animals of the mean size of 180 mm, the changes in the value of V over an interval of about 2 standard deviations on either side of the mean being particularly large in this case, where the variability of the organ considered is very great (V = 21). For the meropodite of the 4th pereiopod, whose variability, though still high, is more normal (V = 12.7), the coefficients of variability corresponding to the same dimensions of the cephalothorax ought to be 4.7, 2.8 and 3.5 respectively. Nothing of the kind has been observed.

4 See in particular the use I made of this possibility in a biometric comparison of two species of the genus Maia, M. squinado and M. verrucosa (Comptes Rendus de l'Académie des Sciences, T. 204, p. 67, 1937).

The only argument that might incline one to prefer expressing the relation between the two variates by a linear formula rather than by a logarithmic formula is the following. If two parts of the same organ, compared with the same reference quantity, obey allometry relations with constants $\alpha_1, b_1$ and $\alpha_2, b_2$ respectively, it is impossible, if $\alpha_1$ and $\alpha_2$ are different, for the whole organ to obey a relation of the same form. The difficulty does not arise for the linear relation, where it suffices to put:

$$y = y_1 + y_2, \qquad a = a_1 + a_2$$

The objection would be valid if functional relations were involved, since in that case the only relations satisfying the additivity condition must be of the form $y = a\,\varphi(x) + b$. This condition is satisfied neither by the allometry relation nor by any of those one might think of opposing to it, since reciprocity between y and x would require that one have at the same time $x = a'\,\varphi(y) + b'$, which brings us back to the linear relation. I showed, more than fifteen years ago, that this objection is without value in the problem that concerns us, where the relation sought is stochastic. Except in exceptional cases, a satisfactory relation will be obtained by taking for $\alpha$ the weighted mean of $\alpha_1$ and $\alpha_2$:

$$\alpha = \frac{\alpha_1 y_1 + \alpha_2 y_2}{y_1 + y_2} = \frac{\alpha_1 + \alpha_2}{2} + \frac{(\alpha_1 - \alpha_2)(y_1 - y_2)}{2\,(y_1 + y_2)}$$

The difference between this mean value of $\alpha$ and the one that would suit the extremity of the distribution where the deviation is greatest is close to $k\,(\alpha_1 - \alpha_2)^2$, where the numerical coefficient $k$ is not legible in the source text. It is almost always, in practice, smaller than the error that can be made in determining an allometry constant, which is, as we shall see, of the order of

$$\alpha\,\sqrt{1 - r^2}\,/\,\sqrt{N}$$
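The weighted-mean rule can be checked on a numerical example. The sketch below is illustrative only: the constants b1, α1, b2, α2 and the value of x are hypothetical, and the function names are not from the paper. It compares the weighted mean of α1 and α2 with the local slope of log y against log x for the whole organ y = y1 + y2, obtained by a central finite difference, and also evaluates the symmetric form of the weighted mean.

```python
import math


def parts(x, b1, alpha1, b2, alpha2):
    """Sizes of the two parts, each obeying its own allometry relation."""
    return b1 * x ** alpha1, b2 * x ** alpha2


def weighted_mean_alpha(x, b1, alpha1, b2, alpha2):
    """(alpha1*y1 + alpha2*y2) / (y1 + y2)."""
    y1, y2 = parts(x, b1, alpha1, b2, alpha2)
    return (alpha1 * y1 + alpha2 * y2) / (y1 + y2)


def symmetric_form(x, b1, alpha1, b2, alpha2):
    """(alpha1 + alpha2)/2 + (alpha1 - alpha2)(y1 - y2) / (2 (y1 + y2))."""
    y1, y2 = parts(x, b1, alpha1, b2, alpha2)
    return (alpha1 + alpha2) / 2.0 + (alpha1 - alpha2) * (y1 - y2) / (2.0 * (y1 + y2))


def local_alpha(x, b1, alpha1, b2, alpha2, h=1e-6):
    """Slope d(log y)/d(log x) of the whole organ, by central difference."""
    def log_y(log_x):
        y1, y2 = parts(math.exp(log_x), b1, alpha1, b2, alpha2)
        return math.log(y1 + y2)
    lx = math.log(x)
    return (log_y(lx + h) - log_y(lx - h)) / (2.0 * h)


# Hypothetical constants for the two parts of one organ.
constants = dict(b1=2.0, alpha1=0.8, b2=0.5, alpha2=1.4)
x = 10.0

a_weighted = weighted_mean_alpha(x, **constants)
a_symmetric = symmetric_form(x, **constants)
a_local = local_alpha(x, **constants)

print(a_weighted, a_symmetric, a_local)
```

The weighted mean coincides with the local slope at each x, so the whole organ follows an allometry relation only approximately, with an exponent that drifts slowly from one end of the distribution to the other.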


II 

Before pursuing our analysis of the notion of allometry, we must deal with a problem that is very important in itself and whose solution is less simple than has been believed. Once it is admitted that an allometry relation may exist between two variables, one needs, in order to verify this hypothesis, a regular method for determining the constants that define it and the precision of the figures obtained. 5 A first question arises immediately. Should the fit be made to the expression $y = b\,x^{\alpha}$ or to the equivalent expression $\log y = \alpha \log x + \log b$? The arguments given in favour of fitting the first formula are not particularly convincing. The calculations are harder to carry out and, moreover, ordinarily give results differing very little from those obtained using the relation in its logarithmic form. Since, furthermore, we concluded from the preceding study that the distribution of $Y = \log y$ and $X = \log x$ must in principle be normal, it appears natural to make all the calculations on the second expression, which we shall write $Y = \alpha X + B$, and in which $\alpha$ and $B$ are to be determined.

The first solution that comes to mind is to identify the line sought with the regression line of Y on X. In the case in which we have placed ourselves by hypothesis, that of the study of a representative sample of the whole population, this solution cannot, however, be retained. It supposes that one is entitled to assign different roles to the two variables, one, X, being regarded as independent and the other, Y, as dependent. It may of course happen that this hypothesis is acceptable, if, for example, Y is the measurement of a small organ and X that of the whole body, but in general X and Y play symmetrical roles. If one has to compare two linear measurements of the body of an animal, the length and width of the cephalothorax of a crab for example, or of two homologous appendages, it is hard to see by what right one would regard one rather than the other as the independent variable, nor consequently how one could choose between the two regression lines. The use of the latter is justified only if, for practical reasons, one wishes to be able to calculate from one of the dimensions, taken by convention as reference, the most probable value of the other; but in that case one will probably not be working on a representative sample of the population, and so one will not be in the case considered here. Their use is generally not legitimate, the two variables being equal in standing, unless one resigns oneself to giving the problem posed not one but two solutions, which must always be considered together. If we wish to arrive at a single line, the "best" possible, we must give up the usual criterion of goodness of fit and the method of least squares in its traditional form, since nothing authorizes us to minimize the sum of the squares of the deviations of Y at constant X rather than that of X at constant Y. The problem is very close to that which arises when a line is to be fitted to a set of pairs of measurements in which both variables are subject to error, a problem that had been little worked on until recently and that has not yet received a truly satisfactory solution. The best still appears to be the one that minimizes the sum of the squares of the distances of the points from the line; the line obtained coincides, as is known, with the major axis of the ellipses of equal probability density. From the geometrical point of view the solution is perfectly reasonable; from the statistical point of view it is less so, for it is not very clear what the distance of a representative point from the line can represent in terms of error. 6 It seemed to me that another minimum principle could usefully be considered, and I propose to represent the allometry relation by the line (D) that minimizes the sum of the products of the deviations of Y for constant X and of X for constant Y. This amounts to minimizing the sum of the areas of the right-angled triangles having their hypotenuse along the line sought, two sides parallel to the axes, and their right-angle vertices at the representative points.

5 The problem has been studied in detail by Kavanagh and Richards (Proceedings of the Rochester Academy of Science, 150, 1942), who showed that the mathematical treatment must vary with the nature of the observational data and the aim pursued. I refer to this excellent work for everything concerning the history of the question, the interpretation of b, and its alleged correlation with α. They rightly insisted on the necessity, in certain cases, of making y and x play symmetrical roles in the calculations, but remained within the framework of the traditional techniques.

Such a procedure is, at bottom, only the transposition into the domain of calculation of a technique of graphical interpolation employed systematically by certain draughtsmen, and which one uses unconsciously when trying to place a line as well as possible through a cloud of approximately aligned points.

It is easily seen that the line sought, (D), passes through the centre of gravity $(\bar X, \bar Y)$ of the distribution. Its slope is given by the value of $\alpha$ that minimizes the sum of the products $[(Y - \bar Y) - \alpha (X - \bar X)]\,[(X - \bar X) - (Y - \bar Y)/\alpha]$, that is, by $\alpha^2 = \sigma_Y^2 / \sigma_X^2$. The positive solution, $\alpha = \sigma_Y / \sigma_X$, gives the parameter sought; the negative solution, $\alpha' = -\alpha$, is the slope of the line (D') that makes this sum of products a maximum.

6 Another drawback of the classical solution is that the line obtained depends on the units adopted for measuring the two quantities compared. To remedy this it suffices to normalize the coordinates, that is, to take as units of measurement the standard deviations of the two quantities. But if such an artifice can, at a pinch, be defended when the comparison bears on the measurements themselves, it cannot reasonably be used when the variables entering the calculations are logarithms. We shall see, however, that the line of least distance so obtained coincides with the one we arrive at by very different and much more acceptable considerations.

Fig. 3

Compared definitions of the allometry line D and of the two regression lines B and C, and of the points corresponding to $(X_1, Y_1)$ on D, $(X'_1, Y'_1)$, and on the regression lines, $(X_1, Y''_1)$ and $(X''_1, Y_1)$.

The equation of the line (D) will be written:

$$Y - \bar Y = \frac{\sigma_Y}{\sigma_X}\,(X - \bar X) \qquad \text{with} \qquad \alpha = \frac{\sigma_Y}{\sigma_X}$$

but can also be put in the symmetric form:

$$\frac{Y - \bar Y}{\sigma_Y} = \frac{X - \bar X}{\sigma_X}$$
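The construction of the line (D) can be sketched in a few lines of code. This is an illustration, not the author's computing procedure: the paired measurements are hypothetical, base-10 logarithms are an arbitrary choice, and the function names are introduced here for convenience. The sketch computes α as the ratio of the standard deviations of the logarithms, places the line through the centre of gravity, and evaluates the mean area of the fitting triangles so that the minimum property can be checked against neighbouring slopes.

```python
import math
import statistics


def allometry_line(X, Y):
    """Return (alpha, B) for Y = alpha * X + B with alpha = sd(Y) / sd(X).

    X and Y are the logarithms of the paired measurements.
    """
    alpha = statistics.pstdev(Y) / statistics.pstdev(X)
    B = statistics.fmean(Y) - alpha * statistics.fmean(X)
    return alpha, B


def mean_triangle_area(X, Y, slope):
    """Mean area of the fitting triangles for a line of the given slope
    drawn through the centre of gravity of the cloud."""
    mx, my = statistics.fmean(X), statistics.fmean(Y)
    total = 0.0
    for x, y in zip(X, Y):
        dy = (y - my) - slope * (x - mx)   # deviation of Y at constant X
        dx = (x - mx) - (y - my) / slope   # deviation of X at constant Y
        total += 0.5 * abs(dy * dx)        # right triangle with legs |dy|, |dx|
    return total / len(X)


# Hypothetical paired measurements in mm (not data from the paper).
x_mm = [140, 152, 160, 171, 180, 188, 197, 205, 220]
y_mm = [62, 75, 80, 96, 104, 118, 125, 141, 160]
X = [math.log10(v) for v in x_mm]
Y = [math.log10(v) for v in y_mm]

alpha, B = allometry_line(X, Y)
area_at_alpha = mean_triangle_area(X, Y, alpha)

print(f"alpha = {alpha:.3f}, B = {B:.3f}")
print(f"mean triangle area at alpha = {area_at_alpha:.6f}")
print(f"at 0.9 alpha = {mean_triangle_area(X, Y, 0.9 * alpha):.6f}")
print(f"at 1.1 alpha = {mean_triangle_area(X, Y, 1.1 * alpha):.6f}")
```

Because α depends only on the two standard deviations, exchanging the roles of X and Y simply replaces the slope by its reciprocal, which is the symmetry the two regression lines lack.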

The variance of the equilibrium constant $\alpha$ is $\sigma_\alpha^2 = \alpha^2 (1 - r^2)/N$, r being the coefficient of correlation of X and Y; it is equal to that of the regression coefficient of Y on X. The lines (D) and (D') form a pair of conjugate directions for the ellipses of equal probability density, and the point $(X'_1, Y'_1)$ of the line (D) corresponding to the measurements $(X_1, Y_1)$ of a particular individual lies on the parallel to (D') drawn through $(X_1, Y_1)$. Its coordinates are:

$$X'_1 = \bar X + \frac{\sigma_X}{2}\left[\frac{Y_1 - \bar Y}{\sigma_Y} + \frac{X_1 - \bar X}{\sigma_X}\right], \qquad Y'_1 = \bar Y + \frac{\sigma_Y}{2}\left[\frac{Y_1 - \bar Y}{\sigma_Y} + \frac{X_1 - \bar X}{\sigma_X}\right]$$

and one can also write:

$$\frac{X'_1 - \bar X}{\sigma_X} = \frac{Y'_1 - \bar Y}{\sigma_Y} = \frac{1}{2}\left[\frac{Y_1 - \bar Y}{\sigma_Y} + \frac{X_1 - \bar X}{\sigma_X}\right]$$

The variance of the deviations between the representative points and the points of the line (D) corresponding to them is constant and equal to $(\sigma_X^2 + \sigma_Y^2)(1 - r)/2$. The measure of the mean area of the fitting triangles, moreover, plays in our calculations the same role as the conditional variance of the classical regression method, and may be called the conditional covariance. It is proportional to the preceding quantity and constantly equal, at all points of the distribution, to $\sigma_X \sigma_Y (1 - r)$. It is seen that the preceding formulas make it possible to associate unambiguously with every real individual of the population the fictitious normal individual that comes closest to it, and to measure in standardized deviations their degree of dissimilarity, which may be defined by the expression:

$$\frac{Y_1 - Y'_1}{\sigma_Y} = \frac{1}{2}\left[\frac{Y_1 - \bar Y}{\sigma_Y} - \frac{X_1 - \bar X}{\sigma_X}\right]$$

Fig. 4

The lines D and D' placed in one of the ellipses of equal probability density: B and C, the two regression lines, and A, the principal axis (α = 1.285, r = 0.7X [second decimal illegible]).
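The formulas of this passage translate directly into code. The following is a sketch under stated assumptions rather than the author's own procedure: the means and standard deviations of the logarithms and the individual $(X_1, Y_1)$ are hypothetical, while r = 0.965 and N = 300 are borrowed from the Maia squinado example given earlier in the text. The function names are introduced here for illustration.

```python
import math


def alpha_standard_error(alpha, r, n):
    """Square root of the variance alpha^2 (1 - r^2) / N."""
    return alpha * math.sqrt((1.0 - r * r) / n)


def normal_individual(x1, y1, mean_x, mean_y, sd_x, sd_y):
    """Point of line (D) reached from (x1, y1) along the direction of (D')."""
    t = 0.5 * ((y1 - mean_y) / sd_y + (x1 - mean_x) / sd_x)
    return mean_x + sd_x * t, mean_y + sd_y * t


def dissimilarity(x1, y1, mean_x, mean_y, sd_x, sd_y):
    """Degree of dissimilarity between the real individual and its normal
    counterpart, in standardized deviations."""
    return 0.5 * ((y1 - mean_y) / sd_y - (x1 - mean_x) / sd_x)


# Hypothetical summary statistics on the logarithmic scale.
mean_x, mean_y, sd_x, sd_y = 2.25, 2.00, 0.05, 0.09
x1, y1 = 2.30, 2.05   # one hypothetical individual

xp, yp = normal_individual(x1, y1, mean_x, mean_y, sd_x, sd_y)
d = dissimilarity(x1, y1, mean_x, mean_y, sd_x, sd_y)
se = alpha_standard_error(alpha=sd_y / sd_x, r=0.965, n=300)

print(f"normal individual: X' = {xp:.4f}, Y' = {yp:.4f}")
print(f"degree of dissimilarity = {d:.4f}")
print(f"standard error of alpha = {se:.4f}")
```

A negative degree of dissimilarity indicates an individual whose Y is small for its X relative to the normal individual, a positive one the reverse.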



The use just made of the allometry formula, to define the theoretical individual corresponding to each of the real individuals studied, is the one for which it was established and to which it is most exactly adapted. In practice, however, it must be used much more frequently to estimate the value of Y that normally corresponds to a certain value of X or, conversely, to estimate the value of X normally corresponding to a certain value of Y. It thus replaces, as the case may be, one or the other of the two regression lines, of Y on X or of X on Y.

The estimate of Y that it provides differs from that given by the regression line of Y on X by $(1 - r)\,\sigma_Y (X - \bar{X})/\sigma_X$. Moreover, the variance of the deviation between the observed Y and the Y calculated by this formula, for a given value of X, is:

$$\sigma_Y^2 (1 - r^2)\left[1 + \frac{1 - r}{1 + r}\,\frac{(X - \bar{X})^2}{\sigma_X^2}\right]$$

which may also be written:

$$2\sigma_Y^2 (1 - r) + \sigma_Y^2 (1 - r)^2 \left[\frac{(X - \bar{X})^2}{\sigma_X^2} - 1\right]$$

It therefore depends on $(X - \bar{X})$; its mean value is $2\sigma_Y^2 (1 - r)$. It equals the conditional variance of Y (the variance of the deviation between the observed Y and the Y calculated from the regression line, which is $\sigma_Y^2 (1 - r^2)$ whatever the value of X) only at its minimum, attained for $X = \bar{X}$. The amplitude of these variations is in any case small. In samples of the usual size it is exceptional for it to exceed appreciably, for the largest or the smallest individuals, the errors normally made in determining the conditional variance. For the same reason, and although the estimates given by the allometry line deviate systematically from those given by the regression line, it is rare, when the correlation is strong, as we have assumed throughout, for these deviations to be notably larger than those normally found between observed points and points calculated from either line. It obviously remains possible that, in certain cases and for certain purposes, the classical regression line is preferable, but this reservation in no way diminishes the interest of the line we have defined.
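The two expressions for the variance of the deviation can be checked against each other numerically. This is a small sketch, assuming only the formulas stated in the text; the function names and the numerical values are hypothetical.

```python
def estimate_gap(x, mean_x, sd_x, sd_y, r):
    """Allometry-line estimate of Y minus regression estimate of Y at a given X."""
    return (1 - r) * sd_y * (x - mean_x) / sd_x


def deviation_variance(x, mean_x, sd_x, sd_y, r):
    """Variance, at a given X, of observed Y minus the allometry-line estimate.

    Returns the two algebraically equivalent forms given in the text.
    """
    z2 = ((x - mean_x) / sd_x) ** 2
    form_a = sd_y**2 * (1 - r**2) * (1 + (1 - r) / (1 + r) * z2)
    form_b = 2 * sd_y**2 * (1 - r) + sd_y**2 * (1 - r) ** 2 * (z2 - 1)
    return form_a, form_b


# Hypothetical constants: at the mean, and one standard deviation above it.
at_mean = deviation_variance(2.0, mean_x=2.0, sd_x=0.1, sd_y=0.15, r=0.9)
at_one_sd = deviation_variance(2.1, mean_x=2.0, sd_x=0.1, sd_y=0.15, r=0.9)
print(at_mean, at_one_sd)
```

At $X = \bar{X}$ both forms reduce to the conditional variance $\sigma_Y^2(1 - r^2)$, and one standard deviation from the mean they equal the mean value $2\sigma_Y^2(1 - r)$.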

One of the most remarkable characteristics of this line is that, among all those that might be used to describe the relation between X and Y, it is the only one whose slope is independent of the value of the correlation coefficient of X and Y. This characteristic, which would perhaps be of little interest in the study of an arbitrary distribution, is most advantageous in a study of quantitative morphology where, as we shall see, the numerical value of the correlation coefficient has no biological interest. It is indeed important to understand that the law one is trying to extract from the whole body of observations is, in the end, the one that could reasonably be expected to appear if the correlation tended toward perfection. Let us imagine a sequence of bivariate distributions with fixed marginal distributions and increasing correlation. The regression lines will then be seen to approach one another and to tend, together with the principal axis, toward a limiting position which is precisely that of the line (D), which itself remains fixed.⁷ It is this permanence which, together with its symmetrical construction with respect to the two variables, gives it a particular interest for us, further increased by the ease with which it is calculated.

Fig. 5

Size allometry in adult male Maia squinado. Length of the propodite of the claw as a function of the length of the meropodite of the last walking pereiopod in 301 individuals. Logarithmic coordinates ($\alpha = 1.57$, $r = 0.963$).

It must be stressed, however, that this calculation is valid only insofar as the lot studied constitutes a representative sample of the population. It could no longer be used if the individuals measured had been selected according to the values of one of the variables. The only method to be recommended in that case is the classical method of least squares.

An important advantage of the regular method, when it can be used and several organs are studied simultaneously, is that it gives at one stroke all the allometry relations that may exist between all the variables taken two at a time. It suffices to write the equations in their symmetrical form:

$$\frac{X - \bar{X}}{\sigma_X} = \frac{Y - \bar{Y}}{\sigma_Y} = \frac{Z - \bar{Z}}{\sigma_Z} = \dots$$

If p organs are compared, all the relations sought are thus represented by a single line in p-dimensional space, whose various projections represent the relations of the organs taken two at a time, three at a time, and so on.
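The symmetrical form implies that every pairwise equilibrium constant is a ratio of two standard deviations, so the whole set is mutually consistent. A minimal sketch follows, assuming hypothetical standard deviations of the log-measurements of three organs labelled X, Y, Z.

```python
from itertools import permutations

# Hypothetical standard deviations of log-measurements for three organs.
log_sd = {"X": 0.080, "Y": 0.154, "Z": 0.098}

# Slope of the allometry line of organ b against organ a: sd_b / sd_a.
alpha = {(a, b): log_sd[b] / log_sd[a] for a, b in permutations(log_sd, 2)}

for pair, value in sorted(alpha.items()):
    print(pair, round(value, 3))
```

Because all the constants derive from one line in p-dimensional space, they chain: the constant of Z against X is the product of that of Y against X and that of Z against Y.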


⁷ The slope of the line (D) is intermediate between that of the principal axis and that of the bisector of the regression lines. As soon as the correlation is strong these three lines lie very close together. With $r = 0.90$ and $\alpha = 0.900$ the slope of the principal axis is 0.889 and that of the bisector 0.9005; with $r = 0.95$ and $\alpha = 0.900$, the corresponding slopes are 0.895 and 0.9001. The mean of the slopes of the two regression lines itself gives a very good approximation, 0.905 and 0.901 respectively. In a case such as that of Maia, where all the correlations exceed 0.95, all these lines practically coincide, since sampling errors may exceed several thousandths. By interchanging the roles of the variables in the preceding example, one would see that the conclusions valid for the allometry $\alpha = 0.900$ are equally valid for the inverse allometry $\alpha = 1.111$.

It is easy to see that when there is isometry ($\alpha = 1$), the principal axis, the bisectors of the regression lines and the allometry line coincide.

We would obtain the same coincidence for any value of $\alpha$ provided that normalized coordinates are used. It may therefore be said, in a certain sense, that the solution proposed here coincides with the one advocated by Pearson, although the justification he gave for it has never seemed fully satisfactory. It should be noted in this connection that the minimum principle we have used, which seems to us the best justification of the allometry line, does not appear to have been invoked either by Pearson or by any of those who have studied the same problem more recently.
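The figures quoted in this footnote can be recomputed from the standard expressions for the regression slopes and the principal (major) axis of a bivariate distribution. The sketch below assumes $\sigma_X = 1$ and $\sigma_Y = \alpha$; the function and key names are hypothetical.

```python
import math


def candidate_slopes(alpha, r, sd_x=1.0):
    """Slopes of the lines discussed in the footnote, for given alpha and r."""
    sd_y = alpha * sd_x
    b_yx = r * sd_y / sd_x        # regression of Y on X
    b_xy = sd_y / (r * sd_x)      # regression of X on Y, expressed as dY/dX
    diff = sd_y**2 - sd_x**2
    cov = r * sd_x * sd_y
    # Principal axis: direction of the largest eigenvalue of the covariance matrix.
    major = (diff + math.sqrt(diff**2 + 4 * cov**2)) / (2 * cov)
    # Bisector of the angle between the two regression lines.
    bisector = math.tan(0.5 * (math.atan(b_yx) + math.atan(b_xy)))
    return {
        "allometry_line": sd_y / sd_x,
        "major_axis": major,
        "bisector": bisector,
        "mean_of_regressions": 0.5 * (b_yx + b_xy),
    }


s90 = candidate_slopes(0.900, 0.90)
s95 = candidate_slopes(0.900, 0.95)
print(s90)
print(s95)
```

The computed values agree with those of the footnote to the precision quoted there (the principal-axis slopes appear to be truncated rather than rounded in the text).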
Without wishing to develop here this extension of the results obtained above, it is worth noting that, if the whole set of p organs studied is taken into consideration, one can, as we did for two organs only, associate with each real individual $X_1, Y_1, Z_1, \dots$ a normal individual characterized by measurements $X'_1, Y'_1, Z'_1, \dots$ such that:

$$\frac{X'_1 - \bar{X}}{\sigma_X} = \frac{Y'_1 - \bar{Y}}{\sigma_Y} = \frac{Z'_1 - \bar{Z}}{\sigma_Z} = \dots = \frac{1}{p}\left[\frac{X_1 - \bar{X}}{\sigma_X} + \frac{Y_1 - \bar{Y}}{\sigma_Y} + \frac{Z_1 - \bar{Z}}{\sigma_Z} + \dots\right]$$

All the preceding calculations were carried out on logarithms, and this procedure is the only one that can be recommended. Known formulas make it possible to deduce from them the constants that characterize the distribution of the measurements themselves. In practice the formulas below always give a sufficient approximation.

Calling $y'$ the number whose logarithm equals the mean value of $\log y$, writing $\bar{Y}$ and $\sigma_Y$ for the mean and standard deviation of the logarithms, and $\bar{y}$, $\sigma_y$, $V_y$ for the mean, standard deviation and coefficient of variation of the measurements, we have approximately:

$$y' = e^{\bar{Y}}, \qquad \bar{y} = y'\left(1 + \sigma_Y^2/2\right)$$

$$\sigma_y = y'\,\sigma_Y\left(1 + 3\sigma_Y^2/4\right), \qquad V_y = \sigma_Y\left(1 + \sigma_Y^2/4\right)$$

$\sigma_Y$ can hardly exceed 0.2, which already corresponds to a ratio of 1 to 4 between the dimensions of the variate in individuals of extreme size. It follows that one may write, to within less than one hundredth:

$$V_y = \sigma_Y, \qquad \sigma_y = y'\,\sigma_Y$$

an approximation that is always sufficient, since the sampling error in the determination of either of these parameters may greatly exceed one hundredth as long as the sample size is below a few thousand. Conversely, if the calculations had been carried out on the measurements themselves, one would have:

$$\bar{Y} = \log \bar{y} - V_y^2/2, \qquad \sigma_Y = V_y\left(1 - V_y^2/2\right)$$

or, in practice:

$$\sigma_Y = V_y$$

As was to be expected, the equilibrium constant is the quotient of the coefficients of variation of the two quantities compared.
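The quality of these approximations can be gauged against the exact moments of a log-normal variable. This is a sketch under the assumption, not stated in the text, that the logarithms are natural and exactly normally distributed; the function name and dictionary keys are hypothetical.

```python
import math


def from_log_moments(mean_log, sd_log):
    """Compare the text's approximations with exact log-normal moments."""
    y_prime = math.exp(mean_log)
    s2 = sd_log**2
    exact_cv = math.sqrt(math.exp(s2) - 1)
    exact_mean = y_prime * math.exp(s2 / 2)
    exact = {"mean": exact_mean, "sd": exact_mean * exact_cv, "cv": exact_cv}
    approx = {
        "mean": y_prime * (1 + s2 / 2),
        "sd": y_prime * sd_log * (1 + 3 * s2 / 4),
        "cv": sd_log * (1 + s2 / 4),
    }
    return exact, approx


# Largest standard deviation of the logarithms considered in the text.
exact, approx = from_log_moments(mean_log=0.0, sd_log=0.2)
for key in exact:
    print(key, exact[key], approx[key])
```

At $\sigma_Y = 0.2$ the three approximations differ from the exact values by less than one part in a thousand, and the coefficient of variation differs from $\sigma_Y$ itself by about one hundredth.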
III

We can now return to our study of the biological problem of allometry.

We saw at the very beginning of this paper that size allometry is very closely linked to variability, and we have just reduced these two problems to one. To define the equilibrium constant $\alpha$ as the quotient of the two coefficients of variation amounts to saying that there is allometry as soon as the quantities compared are unequally variable, and that isometry can exist only if the two variabilities are the same. We have interpreted this variability of an adult population as the play of innumerable causes, each of which has the effect of increasing or decreasing by a very small percentage the dimensions of one organ or another. It is known, moreover, and we shall have occasion to make this precise, that there will be correlation between two organs if certain factors act simultaneously on both, the correlation being the more complete the greater the number of such factors and the more important their individual action. It is clear, finally, that allometry would be perfect only if the correlation were perfect too, the conditional variances vanishing only in that case. More generally, the allometry relation will be followed the more exactly the more rigid the correlation between the two quantities compared, and the fewer the causes that make one of the organs vary independently of the other.

It would consequently be quite unreasonable to seek an explanation of the allometry of two organs in the play of certain factors acting on one of the organs and not on the other; the intervention of such mechanisms can, on the contrary, only disturb the purity of the phenomenon. If there exist growth factors acting more or less selectively on this or that part of the body, their action should be invoked only to account for the existence of larger or smaller deviations between the predictions allowed by the allometry relation and reality, allometry itself being entirely explained by the unequal sensitivity of the two variates to the same set of factors. If, for example, the length of the propodite of the claw of male Maia, compared with the length of the cephalothorax, has an equilibrium constant of 1.93, it is because on average the propodite of the claw of Maia is 1.93 times more sensitive than the cephalothorax to all the actions that may be exerted during development, whether favorable or unfavorable.

It is easy to make these indications quantitatively precise by reasoning on an extremely simplified, yet highly instructive, model. Let us imagine the ideally simple case in which the two organs compared react to the same independent causes of variation and to no others, and in which, for each of them, the effect is proportional to the intensity, the factor of proportionality being the same for all the factors and different for the two organs. Let us adopt for each factor a system of measurement such that its mean value is zero. We can then write, $u_1, u_2, u_3, \dots, u_n$ being the factors of variation, assumed to be very numerous:

$$X = \bar{X} + \lambda u_1 + \lambda u_2 + \lambda u_3 + \dots + \lambda u_n$$

$$Y = \bar{Y} + \mu u_1 + \mu u_2 + \mu u_3 + \dots + \mu u_n$$

We know that these two distributions are normal or approximately normal, and we have:

$$\sigma_X^2 = \lambda^2\left(\sigma_{u_1}^2 + \sigma_{u_2}^2 + \sigma_{u_3}^2 + \dots + \sigma_{u_n}^2\right)$$

$$\sigma_Y^2 = \mu^2\left(\sigma_{u_1}^2 + \sigma_{u_2}^2 + \sigma_{u_3}^2 + \dots + \sigma_{u_n}^2\right)$$

The equilibrium constant will then be $\alpha = \mu/\lambda$, the quotient of the factors of proportionality. The correlation coefficient will be equal to unity, since:

$$r = \frac{\lambda\mu\left(\sigma_{u_1}^2 + \sigma_{u_2}^2 + \sigma_{u_3}^2 + \dots + \sigma_{u_n}^2\right)}{\sigma_X\,\sigma_Y} = 1$$

In this schematic case, allometry and correlation will be simultaneously perfect, because all the variability is governed in both organs by the same causes.

Let us now imagine that a change in the conditions of existence of the species, or a genetic change, causes one group of causes of variability to disappear but substitutes for it two others, independent of each other, exerting actions of the same nature and the same amplitude as the factors that have disappeared, the first acting selectively on one of the organs and the second acting selectively on the other. By hypothesis, the variances of X and Y remain the same, but the correlation coefficient will decrease, since the new factors do not enter its numerator while its denominator remains unchanged. If $v_1, v_2, v_3, \dots$ and $w_1, w_2, w_3, \dots$ are these factors, which by hypothesis exert actions $\lambda v_1, \lambda v_2, \lambda v_3, \dots$ on X and $\mu w_1, \mu w_2, \mu w_3, \dots$ on Y respectively, we find:

$$r = 1 - \frac{\lambda^2\left(\sigma_{v_1}^2 + \sigma_{v_2}^2 + \dots\right)}{\sigma_X^2} = 1 - \frac{\mu^2\left(\sigma_{w_1}^2 + \sigma_{w_2}^2 + \dots\right)}{\sigma_Y^2}$$

and:

$$2(1 - r) = \frac{\lambda^2\left(\sigma_{v_1}^2 + \sigma_{v_2}^2 + \dots + \sigma_{w_1}^2 + \sigma_{w_2}^2 + \dots\right)}{\sigma_X^2} = \frac{\mu^2\left(\sigma_{v_1}^2 + \sigma_{v_2}^2 + \dots + \sigma_{w_1}^2 + \sigma_{w_2}^2 + \dots\right)}{\sigma_Y^2}$$

It is easily verified that the numerators of these last two expressions represent the mean values of the conditional variances of X and of Y, which gives back the known expressions for these quantities, $2(1 - r)\sigma_X^2$ and $2(1 - r)\sigma_Y^2$.

We see that, in this example, allometry and correlation are independent of one another. The allometry remains constant because, by hypothesis, the resultant effect of the whole set of factors remains the same when some of them are replaced by others; the correlation coefficient changes with the number of factors common to the two organs. When the number of common factors decreases, the scatter of the plotted points about the allometry line increases, as shown by the growth of the conditional variances of X and of Y; at the same time the determination of $\alpha$ becomes less precise, as shown by the expression for its variance, $\alpha^2(1 - r^2)/N$. The value of the equilibrium constant $\alpha$ is here equal to $\mu/\lambda$, that is, to the quotient of the two coefficients that measure the respective sensitivity of the two organs to the disturbing actions. We thus see appear, with particular sharpness in this extremely simplified scheme, the essential characteristic of the phenomenon of allometry that we pointed out a moment ago in connection with a concrete case: size allometry expresses the unequal variability of two organs, itself evidence of their unequal sensitivity to the whole set of factors that disturb growth.
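The simplified scheme can be checked with a few lines of arithmetic. The sketch below assumes the model exactly as stated (common factors acting on both organs with coefficients $\lambda$ and $\mu$, plus organ-specific factors of equal total variance); the function name and the variance figures are hypothetical, and $\mu/\lambda = 1.93$ merely echoes the Maia example.

```python
import math


def two_organ_model(lam, mu, common_var, specific_var):
    """Allometry constant, correlation and scatter in the simplified scheme.

    common_var   : summed variance of the factors acting on both organs
    specific_var : summed variance of the factors acting on one organ only
                   (taken equal for the two organs, as in the text)
    """
    total = common_var + specific_var
    var_x = lam**2 * total
    var_y = mu**2 * total
    cov = lam * mu * common_var            # only common factors contribute
    r = cov / math.sqrt(var_x * var_y)
    alpha = math.sqrt(var_y / var_x)       # quotient of the standard deviations
    scatter_y = mu**2 * 2 * specific_var   # variance of Y about the allometry line
    return alpha, r, var_y, scatter_y


# Same total variability, different shares of common factors (hypothetical).
perfect = two_organ_model(1.0, 1.93, common_var=1.0, specific_var=0.0)
mixed = two_organ_model(1.0, 1.93, common_var=0.9, specific_var=0.1)
print(perfect)
print(mixed)
```

Replacing common factors by organ-specific ones lowers $r$ and increases the scatter about the line, while the constant $\alpha = \mu/\lambda$ is unchanged.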

We could easily complicate the preceding scheme and thus come progressively closer to reality. It is readily understood, without any need for explanation, that it may happen, and indeed must happen, that the different factors are of unequal importance and that each has its own coefficients of action. The results will naturally change, but their essential content will remain. Allometry will still be attributable to the whole set of factors of variability; the correlation will still depend only on the factors common to the two organs compared; the residual variability, which is measured by the conditional variance and is responsible for the deviations of the observed points from the fitted line, will still be attributable to the factors that act on one or the other of the two organs but not on both.

If our study is limited to the comparison of two organs, we can go no further; but if we study several organs of the same animal, we find that each organ has its own variability and that the correlation is stronger or weaker depending on the pair of organs compared. One may then ask whether, apart from the general factors that act on all parts of the organism and explain why some degree of correlation always exists between any two organs, there may not exist other factors acting more or less selectively on the growth of this or that part of the animal. These might, moreover, belong to several categories, some extending their action to a whole system of organs, others exerting their influence only on a more limited part of the body. Such a study, without adding anything new to our knowledge of allometry itself, would make it possible to detect certain influences that prevent it from appearing with the same rigor in every case. It may be hoped, moreover, that it could throw some light on facts that would not even have been suspected without it.⁸

Stated in these terms, the problem falls squarely within the province of factor analysis. This statistical technique, now in constant use in psychology, has so far been used only very rarely in biometry, although S. Wright showed its value as early as 1932. It is to be hoped that the methods of calculation recently developed by Delaporte will bring it into general use. I have myself made use of it in a study of the sexual variates of a Crustacean, and I shall indicate below the principles that guided me in that study. They are, in fact, only the generalization of those we have already applied.

⁸ I shall cite as an example the problem of cell size. In a study of the factors that govern it in Mammals (C. R. Soc. Biol., T. 135, 1941, p. 662, 750 and 1309), I was able to show, first, that quite generally the dimensions of the nucleus are related to those of the cell by a relation of the form $N = bC^{\alpha}$, where b is a constant characteristic of the cell type and $\alpha$ is close to 0.6 or 0.7. I then established that the mean size of a cell depends on two factors, one general, which has approximately equal values for animals of the same size whatever their species, the other special, which depends only on the nature of the animal and not on its size.

Let us imagine that we have studied a certain number of organs and that the factors acting on them fall into a certain number of mutually independent groups $u_1, u_2, \dots$; $u'_1, u'_2, \dots$; $u''_1, u''_2, \dots$. For convenience we shall denote each group by a single letter, so that we can write for each organ:

$$Y_i = \bar{Y}_i + \lambda_i U + \lambda'_i U' + \lambda''_i U'' + \dots$$

one or more of the coefficients $\lambda_i, \lambda'_i, \lambda''_i, \dots$ possibly being zero. The number of groups of factors is not defined a priori and may vary with the number of organs studied, since two groups may act together on several organs and separately on certain others.

We shall then have:

$$\sigma_i^2 = \lambda_i^2 \sigma_U^2 + \lambda_i'^2 \sigma_{U'}^2 + \lambda_i''^2 \sigma_{U''}^2 + \dots$$

The fraction of the total variance attributable to U is $r_i^2 = \lambda_i^2 \sigma_U^2 / \sigma_i^2$, and $r_i$ represents the correlation coefficient $r_{iU}$ of $Y_i$ and U. It is shown without difficulty that the correlation coefficient between two organs $Y_p$ and $Y_q$ is:

$$r_{pq} = r_{pU}\,r_{qU} + r_{pU'}\,r_{qU'} + r_{pU''}\,r_{qU''} + \dots$$

Since the correlation coefficients $r_{pq}$ can be measured directly, the task is to calculate from them the coefficients $r_{pU}, r_{pU'}, \dots, r_{qU}, r_{qU'}, \dots$. There is no general method for solving such a system of equations, and one must proceed by a series of methodical trials, successively trying schemes with 1, 2, 3, ... factors, until the $r_{pU}$ so obtained reproduce the observed $r_{pq}$.
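The relation between the factor correlations and the correlations between organs lends itself to a short numerical check of any trial scheme. This is a sketch only; the loadings below are hypothetical and do not reproduce the Maia results, and the trial-and-error fitting itself is not implemented.

```python
# Hypothetical correlations of three organs with two independent factor
# groups (U, U'); a zero means the group does not act on that organ.
loadings = {
    "Y1": (0.95, 0.20),
    "Y2": (0.96, 0.15),
    "Y3": (0.97, 0.00),
}


def implied_correlation(p, q):
    """r_pq = r_pU * r_qU + r_pU' * r_qU' + ... for independent factor groups."""
    return sum(a * b for a, b in zip(loadings[p], loadings[q]))


def residual_fraction(p):
    """Share of the variance of organ p not accounted for by the listed groups."""
    return 1.0 - sum(a * a for a in loadings[p])


for p, q in (("Y1", "Y2"), ("Y1", "Y3"), ("Y2", "Y3")):
    print(p, q, round(implied_correlation(p, q), 4))
print({p: round(residual_fraction(p), 4) for p in loadings})
```

A trial scheme is acceptable when the implied correlations match the measured ones within sampling error; otherwise a further group of factors is added and the comparison repeated.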

The application of this method⁹ enabled me to recognize in Maia a general factor which governs overall size, as evaluated by the dimensions of the cephalothorax, of which the measurement L is the best estimate, and which controls 92 to 94% of the total variance of the appendages; a group factor G which acts on the appendages as a whole and controls 3 to 6% of their variance; and a regional factor B which acts selectively on the anterior appendages, of whose variance it controls about 3%. The residual variance, which is of the order of 1% of the total variance, is attributable to local factors and can be evaluated only by difference.

From the full set of measurements of the appendages it is possible to calculate the numerical values of G and B for each individual and, in addition, to establish prediction formulas more complete than the allometry relation in that they involve L, G and B simultaneously. One then finds that it is possible to recalculate, from this set of three numbers, the dimensions of each individual with a precision better than one hundredth, the coefficients of residual variability all being less than one tenth of the initial coefficients of variability.

⁹ See Biotypologie, T. 6, pp. 73-98, 1938.

Until further research on this problem has been carried out, the future of this kind of study cannot be foreseen. It is permissible to hope that such studies will throw some light on the cases in which the allometry relation holds only imperfectly, either because the cloud of plotted points is too widely spread or because it curves at one of its ends.

IV 

We must now show that the problem of growth allometry is not fundamentally different from that of size allometry.

We note first that nothing said above about size allometry implies that the animals compared have definitively ceased to grow; we assumed only that the population studied contained none but individuals that had reached the same stage of development. Our conclusions would hold, for example, just as well for each of the larval stages of an Insect as for its imaginal stage. It is from this very simple remark that we shall be able to build the demonstration sought, which we shall first establish for the development of an Arthropod, the discontinuity of growth in this phylum making the facts particularly clear.

Let us consider the successive stages of the development of one of these animals, a Crustacean for example, choosing them within a single phase of growth, that is, within a period of life in which no organic remodeling occurs, nor any notable change in the physiological functioning of the various systems. Let us suppose, more precisely, that we are in the conditions under which a study of relative growth has a meaning, that is, that the form of an individual depends only on its size. It may happen that the population of each of these stages is so little variable, and the rate of increase at the molt so large, that the representatives of two consecutive stages have dimensions that do not overlap. In general, however, the largest individuals of the first stage have dimensions greater than those of the smallest individuals of the second stage. In other words, the clouds of points representing the two populations of measurements of x and y partly overlap. Since we have expressly assumed that in the species studied form does not depend directly on age, the two series of measurements in the zone of overlap must necessarily correspond to the same law of growth and give results as concordant as if they had been made in the same zone on two populations belonging to the same stage. It follows that, if a power curve expresses in this zone, for the first population, the relation between the two variates, it must express it just as well for the second. The allometry relations $f_1$, corresponding to the first population, and $f_2$, corresponding to the second, which coincide in the zone of overlap, will then coincide over their whole extent. The same would hold for all the stages that precede or follow the two stages considered so far, on the sole condition that throughout this phase of development one is entitled to regard form as a function of size. The law of growth f will, on the other hand, be replaced by a different law $\varphi$ when development has reached a stage such that, in the zone of overlap, the two populations must be regarded as different.

BIOMETRICS MARCH 1948

Diagram corresponding to that of Fig. 4, under the hypothesis that the population contains in equal numbers the representatives of two consecutive stages of development. See the text.

The problem of finding a law of relative growth valid for one of the phases of the development of an Arthropod thus reduces, under our hypotheses, to that of finding the relation between two organs for the animals of a single stage chosen arbitrarily within that phase. This explains, in the most natural way, the formal identity of the relations that allow us to compare the successive forms realized in the course of development and the various forms that animals of a single stage may present according to their size. It also explains very simply why the passage from one phase to the next is marked on the curve of relative growth by an angular point or a discontinuity, since there is no reason why the curves f and $\varphi$ should join one another tangentially.

The new element we have just introduced into the discussion will allow us to present the problem of determining the constants of the curve in a way different from the one already known to us. The procedure for calculating these constants must be such that it gives the same result whether it is applied to the measurements corresponding to a single stage or to two or more successive stages. To place ourselves in a more concrete setting, we shall adopt a mode of graphical transcription of the results that makes the cloud of points representing the population of measurements for a given stage approximately normal. We know that this result will frequently be achieved by using logarithmic coordinates; sometimes arithmetic coordinates may serve just as well. The correlation between the two variates being strong by hypothesis, the cloud will be greatly elongated along a line on which, according to what has just been said, the clouds representing the other stages will also be ranged. The definition of the parameters of this line must be such that the result obtained is the same whether one has used a single cloud of points, several of them, or all of them together. A very simple calculation shows that this condition suffices to exclude the regression lines, which we already knew to answer the problem only very imperfectly, and, at least when the coordinates are logarithmic, the principal axis which, since it provides a solution symmetrical in X and Y, might otherwise have been retained. It is clear, on the other hand, that if the ratio of the variances of Y and X has the same value for all the populations, as we assumed in admitting that the allometry is the same for two consecutive stages, the desired result will be achieved by taking as the slope of the line the quotient of the standard deviations of the variables compared, which, if the clouds of points are truly aligned, is the same for each of them and for the pooled cloud. We thus recover, by an entirely different route, the result we had reached in studying size allometry.
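The invariance condition can be illustrated by pooling two stages analytically. The sketch below assumes an equal mixture of two stages with identical within-stage standard deviations and correlation, whose means lie on one allometry line; the function name and numerical values are hypothetical.

```python
import math


def pooled_two_stages(alpha, sd_x, r, gap):
    """Moments of a 50/50 mixture of two stages lying on one allometry line.

    gap : difference between the stage means of X; the means of Y then
          differ by alpha * gap, since both stage means lie on the line.
    """
    sd_y = alpha * sd_x
    between_x = gap**2 / 4                   # between-stage variance of X
    var_x = sd_x**2 + between_x
    var_y = sd_y**2 + alpha**2 * between_x
    cov = r * sd_x * sd_y + alpha * between_x
    return {
        "sd_ratio": math.sqrt(var_y / var_x),   # slope of the allometry line
        "regression_y_on_x": cov / var_x,       # slope of the regression line
    }


single = pooled_two_stages(alpha=1.5, sd_x=0.1, r=0.9, gap=0.0)
pooled = pooled_two_stages(alpha=1.5, sd_x=0.1, r=0.9, gap=0.3)
print(single)
print(pooled)
```

The quotient of the standard deviations is the same for one stage and for the pooled cloud, whereas the regression slope changes when the stages are pooled, which is the property that excludes the regression lines.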

The proposition announced is thus established for an Arthropod, but before going further it must be made clear that the conclusions we have reached are of rather unequal generality.

We have shown that, within one phase of growth, a single law f applies both to the size allometry of each of the stages and to the growth allometry of these stages taken together. We have shown incidentally that two successive phases are normally separated, on the curve representing the phenomenon, by an angular point or a discontinuity, marking the passage from one law of growth to another. But we have not shown that such phases exist, and could not show it, since this is a question of fact. Observation shows, moreover, that while there are many cases in which a phase comprises five or six stages or even more, there are others, such as the postpuberal development of certain Crabs, in which each stage must be regarded as constituting a distinct phase by itself. The total curve may thus include at its end one or more segments representing only a size allometry, whereas the preceding segments describe both a growth allometry and a size allometry. This is the case for Maia squinado, where the curves expressing the development of the appendages in both sexes, or that of the abdomen in females, comprise, in the part at present known, three segments, of which the last, separated from the preceding one by a discontinuity, relates exclusively to the size allometry of the adult.

LA RELATION D'ALLOMÉTRIE

Several additional hypotheses were needed in order to specify the form of the relation $f(x, y) = 0$. We assumed that we had at our disposal a mode of transcription of our results that made the cloud of plotted points approximately normal at each stage. The preceding demonstration does imply that, if this result can be achieved for one of the stages, it is necessarily achieved for all those belonging to the same phase, since the relation uniting the two variables is unique and, once it has become linear in one of its parts, necessarily becomes linear in the others. But it cannot tell us which type of transformation of coordinates (anamorphosis) is appropriate in each case. While it is true that the result can often be achieved by using logarithmic coordinates, and while we have good reasons to believe that this is very generally so, we know of a few cases in which it is otherwise. These are the cases known as variable allometry, in which the logarithmic curve shows a more or less marked concavity or even a point of inflection. So far, it seems, they are observed mainly on the sections of the total curve that relate to adults and essentially represent size allometries. But it is not excluded that growth allometries of the same type may some day be found.

Allometry of the abdomen in female Maia squinado, at two consecutive stages of development ($\alpha = 1.28$).

Finally, we recall that the choice of the "best curve" to represent a set of measurements always involves a large element of arbitrariness, and that the solution we have reached, however satisfactory it may be, is not the only one that can legitimately be retained. We also point out that this solution is essentially theoretical, that it is valid only when a whole series of conditions has first been fulfilled, and that, furthermore, it in no way claims to be a routine method for calculating the constants of an interpolation curve.

The remarks and reservations just made, useful though they may be, must not be regarded as diminishing the scope of our demonstration. They merely make its meaning precise and recall, if need be, that a theoretical scheme must not be applied to a concrete case without indispensable precautions. It remains for us to show briefly that the preceding results extend to organisms whose growth does not proceed by molts. If we chose the case of the Arthropods for our demonstration, it is because, in this phylum, the various stages clearly define a succession of populations on each of which the phenomena of size allometry can be studied. But it is clear that such a study can equally be made on animals with continuous growth, provided one has a criterion that defines unambiguously which individuals are comparable, a criterion which may of course, where appropriate, be age. The preceding arguments remain valid, with the usual reservation that, in the period studied, no physiological phenomenon intervenes that is capable of giving two animals of the same size dissimilar structures. For the rest, the interesting difference between the general case and that of the Arthropods concerns the definition of the successive stages which, instead of being imposed by the very nature of development, may become largely arbitrary. One can, in principle, multiply these stages as much as one wishes without changing anything in our arguments, which allows the demonstration to be extended from animals with discontinuous development to animals with continuous growth.

A few lines will suffice to extend to the biochemical comparison of organisms the procedures used in morphological comparison. Nothing in the formulas we have used implies that x or y must measure the length of an appendage rather than the weight of the brain or the quantity of calcium contained in an organism. Every interpretation valid in the one case is also valid in the other, and the notion of biochemical allometry, which has been sharply criticized but has also been used with great success without ever being satisfactorily explained, now appears fully justified.


The many investigations in quantitative morphology of the last twenty years have brought to light a great number of new facts and have given precise form to notions whose importance earlier work had only foreshadowed. For the most part these investigations have employed only very elementary methods and a very limited mathematical apparatus.

The simplicity of this mode of study has greatly helped its spread; it has made it possible to gather, in a fairly short time, valuable information on the growth of very diverse animals and to establish precise comparisons between adults of the same species or of neighbouring species. It is certain that investigations of this kind still retain a good part of their usefulness and that, for a long time to come, they will make very appreciable contributions to our knowledge in certain fields.

But in a growing number of circumstances it is becoming ever clearer that legitimate conclusions can be drawn from such studies only if one has criteria more reliable than those afforded by simple graphical fitting. Workers have therefore been led to add to the quick and easy, but only approximate, technique of allometry the more laborious and more sophisticated, but also more rigorous, techniques of classical biometry. More and more often the constants of the allometry formula have come to be accompanied by their standard deviations. In doing so, however, one merely juxtaposes two procedures, one of biological origin and the other of mathematical origin, without fusing them into a coherent technique.

The very notion of allometry, despite all the attempts that have been made to penetrate its biological meaning, is still today hardly more than a simple datum of experience, whose rational foundation remains insecure and which many of its supporters still regard as essentially empirical. The very diversity of the circumstances in which its use seems permissible makes it harder to understand, and the contradiction grows between the increasing importance of the results obtained and the ever more apparent precariousness of their foundation. To overcome this contradiction it has become necessary to rethink the very notion of allometry.

That is what we have tried to do in the preceding pages. Setting aside the incidental difficulties, and provisionally neglecting all the phenomena which, more or less legitimately, appear to disagree with the simplicity of the fundamental relation, we have tried to understand the true meaning of the allometry relation. It appears to me that we have very nearly succeeded, and that the relation is nothing other than one aspect of the essential variability of all living beings.


DISCUSSION 

Lois M. Zucker. I should like to discuss some data we have which fit in rather well with a question Dr. Teissier has raised. We have data on the relation between femur ash and body weight in growing rats, selected in five age groups. Each age group provides a scatter diagram relating log-ash to log-weight, and these scatter diagrams overlap as in Dr. Teissier's example. Now, does the same line or the same slope apply both within ages and between ages? In other words, is the relation independent of age? The choice of line to use in fitting the data becomes rather critical. Neither variable is independent and there is error in both variables. Therefore the preferred line may fall between the two regression lines. If the deviations parallel to log-ash are minimized, the five slopes do not differ significantly from a common trend which, in turn, is very much flatter than the trend between ages; if the deviations parallel to log-weight are minimized, the five slopes have a common trend which does not differ significantly from the trend between ages.

The line suggested by Dr. Teissier lies somewhere between. But is it the best line? It seems to us that it has a flaw. Consider the ideal line expressing the relationship. Suppose there are only, say, ten possible type positions along this line instead of a continuous set. Neither variable is independent. Actual points will range themselves around their type position in elliptical distributions which, according to the error distribution, may be nearly circular (each variable with equal error), tall and narrow (most of the error in the Y coordinate), or low and wide (most of the error in X). In the last case the regression of X on Y is a good estimate of the true line, while the regression of Y on X, and to a lesser extent lines like the one suggested by Dr. Teissier, which he approximated midway between the two regression lines, are too flat in slope. In the case with most of the error in the Y coordinate, the regression of Y on X is a good estimate, whereas the regression of X on Y, and to a lesser extent Dr. Teissier's line, are too steep. It is only if the error is approximately the same for both variables that a line between the two regression lines, such as that suggested by Dr. Teissier, is a good estimate. It does not seem possible to accept for all these situations a single solution independent of the error distributions, like Dr. Teissier's line or the one suggested by Dr. Wald in a recent publication.
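The argument about error distributions can be checked numerically. The sketch below is illustrative only: it scatters points about an ideal line of slope 1 and compares the regression of Y on X, the regression of X on Y (expressed as a slope dy/dx), and a line between them. It assumes that the intermediate line is the reduced major axis, whose slope is the ratio of the two standard deviations; that identification, the function names, the error sizes and the random seed are assumptions of the example and are not taken from the discussion.

```python
import random
import statistics as st


def slopes(x, y):
    """Return (Y-on-X slope, reduced-major-axis slope, X-on-Y slope as dy/dx)."""
    mx, my = st.fmean(x), st.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b_yx = sxy / sxx                      # minimizes deviations parallel to Y
    b_xy = syy / sxy                      # minimizes deviations parallel to X
    sign = 1.0 if sxy >= 0 else -1.0
    rma = sign * (syy / sxx) ** 0.5       # geometric mean of the two slopes
    return b_yx, rma, b_xy


def simulate(sd_x, sd_y, true_slope=1.0, n=2000, seed=1):
    """Scatter n points about the ideal line with the given error sizes."""
    rng = random.Random(seed)
    t = [rng.uniform(0.0, 3.0) for _ in range(n)]   # positions on the ideal line
    x = [ti + rng.gauss(0.0, sd_x) for ti in t]
    y = [true_slope * ti + rng.gauss(0.0, sd_y) for ti in t]
    return slopes(x, y)


wide = simulate(sd_x=0.6, sd_y=0.05)   # "low and wide": most error in X
tall = simulate(sd_x=0.05, sd_y=0.6)   # "tall and narrow": most error in Y
print("most error in X:", wide)
print("most error in Y:", tall)
```

With most of the error in X, the X-on-Y slope stays close to the true value while the other two are flatter; with most of the error in Y the roles are reversed, and the intermediate line errs in the steep direction.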

In response to Dr. Teissier's suggestion that his line is recommended only when neither variable is independent, I should like to state that a scatter diagram like that obtained with one variable independent can also arise when neither variable is independent, if the error is mostly associated with one variable. It is usually impossible to estimate objectively the relative error associated with the two variables in cases like this, because the error is largely biological and not error of measurement.

We believe, for instance, that in the relation between femur ash and body weight much more error is associated with body weight than with ash. It is very difficult to affect the ash by experimental means (water deprivation, starvation, disease, or nutritional deficiencies other than of calcium, phosphorus or vitamin D), while the body weight, largely affected by the weight of water and soft tissues, is very sensitive to experimental procedure. Should it not, therefore, be much more variable? For this situation we should prefer a line very close to the regression of log-body weight on log-femur ash, not a line halfway towards the other regression line. However, our belief as to the error distribution cannot be directly verified objectively.

BIOMETRICS MARCH 1948

(For data see Zucker and Zucker, American Journal of Physiology, 146, 585, 593, 1946; for further discussion see Zucker, Human Biology, 19:232, 1947.)


Harold Hotelling. From a purely descriptive standpoint, data on a pair of variates such as measures of a bone cannot possibly be summarized adequately by any one regression line. Even if we make the simplifying assumption of a bivariate normal distribution, there are five parameters, and these cannot be summarized by the two coefficients in the equation of a straight line.

When we pass over to the relation of bone dimensions or chemical composition to vitamin D in the diet we are on different ground. If the object is simply to predict bone ash percentage as a function of vitamin D, then a simple regression equation for bone ash percentage on vitamin D quantity is in order, and this equation will disregard all measures of the bone excepting ash content. If, however, there is a question whether vitamin D has some effect on the bone, without advance knowledge of the particular kind of effect, and if say eight measures on each bone are taken to represent it fully, then a more general method should be used. This is arithmetically identical with the technique of fitting a multiple regression equation and testing the significance of R.

The question at issue may be the still broader one of finding some combination of vitamins that will affect in some way, not initially known, the shape or composition of a bone. Then the appropriate method is to determine together the most predictable function of the bone dimensions and contents, and the best function of the various vitamin doses for predicting this. Techniques for doing this, and for testing the significance of the results, have been developed (Biometrika, 28:321-377, 1936). This kind of approach will, I think, supersede, for this kind of purpose, the summarization of the relations between measures on a bone by any regression line or its n-dimensional generalizations.


N. Rashevsky. The statistical relations discovered by Professor Teissier constitute important new knowledge. This kind of knowledge concerns how phenomena happen. If we wish the answer to the question of why they happen, we must extend our approach by introducing hypotheses and postulates. Professor Teissier is well aware of this himself, since he mentioned that he had a physiological explanation of some of these phenomena.

To illustrate what the theoretical method can do with the problem of growth, I wish to mention some studies in the Section of Mathematical Biophysics at the University of Chicago, relating to the problem of cancer.

Multicellular organisms grow up to a certain limit, whereas in tissue cultures cellular multiplication may go on practically indefinitely. This suggests that some inhibitory factor is produced by each cell, so that as the total number of cells of the organism increases and each cell becomes inhibited by a greater and greater number of cells, growth eventually stops. The mathematical formulation of such an assumption leads to growth curves reminiscent of those described by Professor Teissier. As has recently been shown by Kesselman, some aspects of allometric growth can be described similarly. The theory implies that after the adult stage is reached, the total amount of the inhibitory factor decreases, owing to the repair of the natural wear of the organism and of accidental wounds. The theory leads to an expression for the decrease of velocity of wound-healing with age. Although the data on this question are very meager, whatever is available is in agreement with the theory. As the organism grows older and the total inhibitory effect decreases, small accidental fluctuations will be sufficient to produce accidental growth. Hence such accidental growth would be expected to be more frequent in older age. Actually, it is known that the incidence of cancer does increase with age.

The theory also enables us to derive the incidence curve as a function of age, and the theoretical curve is found in good agreement with the observed one. From one of the parameters of the incidence curve we can compute the natural human longevity, that is, longevity as it would be if unaffected by illness or accident. The theoretically computed value turns out to be of the order of 80 to 100 years. It is certainly remarkable that from a single theory we may compute such different things as decay in wound-healing ability, incidence of cancer and human longevity. But the most important progress in science consists principally in finding relations between phenomena which at first sight appear to be unrelated. In cancer research the purely statistical method is of extreme importance to give us the data. But for a complete success it must be supplemented by a broader theoretical approach.


Georges Teissier. I have followed with interest the remarks made after my paper, but although I believe I have understood the general sense of these contributions, I am not sure enough of having grasped their finer points to be able to discuss them in detail.


To Mrs. Lois Zucker I shall reply only that it does not seem to me that we are speaking of exactly the same things. The hypotheses she adopts for the two curves she studies differ too much from mine for conclusions valid in the case I have dealt with to apply without modification to the schemes she has considered.


To Dr. Hotelling I shall reply that I know very well that the allometry line, no more than any regression line of any kind, cannot by itself provide a complete description of a set of measurements. I do not intend to replace the classical statistical description of the facts by a simpler and equally effective procedure, but rather to give, of the essentials of these facts, a scheme that is clear and interpretable in biological terms. The immense literature devoted for more than twenty years to relative growth establishes beyond dispute that the allometry relation fulfils this purpose perfectly. My aim today was not to establish its practical validity, which is certain, but only to bring out its double meaning, biological and statistical, a result which I do believe I have achieved.


To Mr. Rashevsky I shall say that I am myself responsible for a physiological scheme which, I believe, justifies quite adequately the application of the allometry relation to a growing organism, and which at the same time explains the existence of critical stages and of phases of development. But this scheme holds neither for size allometry nor for biochemical allometry.

Since it would be unreasonable to suppose that the identity of the relations arising in the three cases could result from mere coincidence, or that the justification of a relation applicable in such dissimilar circumstances could rest on mechanistic considerations, one must look in another direction. I believe I have succeeded in showing that a statistical interpretation answers the question fully. This naturally does not mean that we must give up seeking the physiological and physicochemical mechanisms at work in each case. While I am determined, for my part, to continue the study I have undertaken of this aspect of the problem, I wish to state that in my opinion, and in view of countless attempts of which Robertson's remains the prototype, it is dangerous to try to found a theory of growth on a physicochemical scheme that is necessarily too simple.



THE GENERAL THEORY OF PRIME-POWER
LATTICE DESIGNS*

I. INTRODUCTION AND DESIGNS FOR $p^n$ VARIETIES IN
BLOCKS OF $p$ PLOTS

Oscar Kempthorne and Walter T. Federer†


I. INTRODUCTION 

Extensive use is now made of the lattice designs originated by Yates [5, 6] for testing large numbers of varieties. The general principle in the structure of these designs is that the varieties are regarded as arising from the combination of factors at the same or different levels. Some of the effects and interactions between pseudo-factors are confounded with blocks or with some other restrictions in each replicate of the experiment.

A number of replicates must be used, confounding, in general, different sets of effects and interactions. Such confounding results in a gain in information on unconfounded effects and interactions over what would be obtained with randomized complete blocks, since they are based on comparisons within smaller blocks of plots. This gain may be offset to some extent by a loss in information on effects and interactions which are confounded among incomplete blocks.

Yates [8, 9] and Cochran [1, 2] have described the theory by which information contained in block comparisons can be utilized for the various designs which had been devised. The purpose of the present series of papers is to give a systematic description of lattice trials with any number of replicates for a number of varieties which is a power of a prime. The present paper contains the basic factorial theory and a description of designs for $p^n$ varieties in blocks of $p$ plots. It will be followed by papers dealing with other lattice designs and numerical examples.


* Contribution of the Statistical Section of the Iowa Agricultural Experiment Station in cooperation with the Bureau of Agricultural Economics, United States Department of Agriculture. Journal paper no. J 1553. Project 890.

† Associate Professor, Statistical Laboratory, Iowa State College, and Associate Agricultural Statistician, Bureau of Agricultural Economics, collaborating with the Iowa Agricultural Experiment Station, respectively.

II. BRIEF DESCRIPTION OF THE $p^n$ FACTORIAL SYSTEM

(a) Effects and interactions. A description [3] has been given of the $p^n$ factorial system (where $p$ is a prime) with the use of geometrical terminology, and it is sufficient here to recapitulate the main results. The $p^n$ treatment combinations may be represented by an $n$-dimensional lattice, each side of which contains $p$ points: thus, using coordinates $x_1, x_2, \ldots, x_n$ to represent the levels of each factor in a treatment combination, the control-treatment combination will be represented by

$$x_1 = x_2 = \cdots = x_n = 0;$$

the treatment consisting of the first factor at unit level and all the other factors at zero level by

$$x_1 = 1, \quad x_2 = x_3 = \cdots = x_n = 0;$$

and the treatment combination in which all factors are at level $(p-1)$ by the point

$$x_1 = x_2 = \cdots = x_n = p - 1.$$

Using the ordinary definition of effects and interactions as given in Yates' fundamental work, "The Design and Analysis of Factorial Experiments" [7], the main effect of factor 1 is given by the contrast between the yields of treatment combinations represented by

$$x_1 = 0, \quad x_1 = 1, \quad x_1 = 2, \quad \ldots, \quad x_1 = p - 1,$$

and it has $(p-1)$ degrees of freedom. There are in all $n$ such comparisons for main effects, which may be represented symbolically by $x_1, x_2, x_3, \ldots, x_n$, each having $(p-1)$ degrees of freedom.

The interaction of two factors represented by $x_1$ and $x_2$ has in all $(p-1)^2$ degrees of freedom, and these may be split into $(p-1)$ sets of $(p-1)$ degrees of freedom represented as follows: the contrasts between those treatment combinations for which

$$x_1 + x_2 = 0, 1, 2, \ldots, p-1 \pmod{p},$$

the contrasts between those treatment combinations for which

$$x_1 + 2x_2 = 0, 1, 2, \ldots, p-1 \pmod{p},$$

and so on to the contrasts between those treatment combinations for which

$$x_1 + (p-1)x_2 = 0, 1, 2, \ldots, p-1 \pmod{p}.$$
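The partition described above can be made concrete with a short sketch. The function name `classes` and the choice $p = 3$ are illustrative assumptions; the code simply groups the treatment combinations by the value of a linear form modulo $p$, so that each set of $(p-1)$ degrees of freedom corresponds to the contrasts among $p$ class totals.

```python
from itertools import product


def classes(p, coeffs):
    """Group the treatment combinations of a p^n factorial by sum(c*x) mod p."""
    n = len(coeffs)
    groups = {r: [] for r in range(p)}
    for x in product(range(p), repeat=n):
        r = sum(c * xi for c, xi in zip(coeffs, x)) % p
        groups[r].append(x)
    return groups


p = 3
for k in range(1, p):
    # x1 + k*x2 = 0, 1, ..., p-1 (mod p): one set of (p-1) degrees of freedom
    print(f"x1 + {k}*x2 (mod {p}):", classes(p, (1, k)))
```

For a prime $p$, every class of one linear form meets every class of a different form in exactly one treatment combination, which is why the sets of contrasts are mutually orthogonal.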

The formal relationship of the above definitions of interactions with those given by Yates [7] in the case of the $3^n$ system is easily seen. The four degrees of freedom for the interaction of two factors $a$ and $b$, each at three levels, are, in Yates' terminology, partitioned into two pairs of degrees of freedom, AB(I) and AB(J), each pair being obtained from the comparison of three totals. By reference to Yates' definitions [7], the contrast AB(J) is the same as that denoted above as the contrast between the three totals given by

$$x_1 + x_2 = 0, 1, 2 \pmod{3},$$

and AB(I) is the same as that denoted by

$$x_1 + 2x_2 = 0, 1, 2 \pmod{3}.$$

The three-factor interaction of $x_1$, $x_2$ and $x_3$ has in all $(p-1)^3$ degrees of freedom and may be split into $(p-1)^2$ comparisons of $p$ totals, each comparison having $p-1$ degrees of freedom. In our geometrical terminology they are given by

$$x_1 + \alpha_2 x_2 + \alpha_3 x_3 = 0, 1, 2, \ldots, p-1 \pmod{p},$$

where $\alpha_2$ and $\alpha_3$ take on all values from 1 to $(p-1)$. Again the formal identity with Yates' definitions [7] is easily seen for the case of three factors each at three levels: in this case, the degrees of freedom for the three-factor interaction may be divided into four sets of two degrees of freedom, represented by

$$x_1 + x_2 + x_3 = 0, 1, 2 \pmod{3},$$
$$x_1 + x_2 + 2x_3 = 0, 1, 2 \pmod{3},$$
$$x_1 + 2x_2 + x_3 = 0, 1, 2 \pmod{3},$$
$$x_1 + 2x_2 + 2x_3 = 0, 1, 2 \pmod{3},$$

and these are the comparisons which were denoted by Yates as Z, Y, X, W respectively [7].

The extensions of the above are obvious and need not be given in detail. If the factors are called $a$, $b$, $c$, $d$, etc., then with a simplification of the notation the effects and interactions may be represented as follows:

main effects: $A$, $B$, $C$, $D$, etc.

two-factor interactions:

$AB, AB^2, \ldots, AB^{p-1}$,

$AC, AC^2, \ldots, AC^{p-1}$,

$AD, AD^2, \ldots, AD^{p-1}$,

$BC, BC^2, \ldots, BC^{p-1}$, etc.,

three-factor interactions:

$ABC, ABC^2, AB^2C, AB^2C^2, \ldots, AB^{p-1}C^{p-1}$,

$ABD, ABD^2, AB^2D, AB^2D^2, \ldots, AB^{p-1}D^{p-1}$,
etc.

It should be noted that in order to have a unique enumeration, the power of the first symbol should be unity. This may be done by noting, for example, that in the $3^n$ system $A^2BC$ gives the same contrasts as $AB^2C^2$.
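The rule that the first symbol should carry unit power can be sketched as a small routine. The representation of a symbol by its tuple of exponents, and the names `normalize`, `symbol` and `all_effects`, are assumptions made for this illustration; the rescaling multiplies every exponent by the inverse, modulo the prime $p$, of the first non-zero exponent.

```python
from itertools import product


def normalize(exps, p):
    """Rescale an exponent vector so its first non-zero exponent is 1 (mod p)."""
    lead = next((e % p for e in exps if e % p), None)
    if lead is None:
        raise ValueError("the all-zero vector is not an effect")
    inv = pow(lead, -1, p)                 # inverse modulo the prime p
    return tuple((e * inv) % p for e in exps)


def symbol(exps, letters="ABCDEFGH"):
    """Write an exponent vector as a symbol such as AB^2C^2."""
    return "".join(l if e == 1 else f"{l}^{e}" for l, e in zip(letters, exps) if e)


def all_effects(p, n):
    """One exponent vector for every effect and interaction of the p^n system."""
    return sorted({normalize(e, p) for e in product(range(p), repeat=n) if any(e)})


print(symbol(normalize((2, 1, 1), 3)))      # canonical form of A^2 B C for p = 3
print([symbol(e) for e in all_effects(3, 3)])
```

Each canonical symbol carries $(p-1)$ degrees of freedom, so the enumeration contains $(p^n - 1)/(p - 1)$ symbols in all.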

In the $2^n$ factorial system, the concept of generalized interaction [7] is of great use: if, for example, interactions $ABCD$ and $BCDE$ are confounded, then so is their generalized interaction $AE$, obtained by multiplying the symbols together with the rule that

$$A^2 = B^2 = C^2 = \cdots = 1.$$

Similarly, the concept of generalized interaction is necessary for the discussion of the $p^n$ factorial system. It may easily be shown by use of the geometrical method described above that if effects or interactions represented by $X$ and $Y$ are confounded, then the effects or interactions represented by $XY, XY^2, \ldots, XY^{p-1}$ must also be confounded. By the generalized interaction of two interactions $A^{\alpha_1}B^{\beta_1}C^{\gamma_1}\cdots$ and $A^{\alpha_2}B^{\beta_2}C^{\gamma_2}\cdots$ is then meant the $(p-1)$ interactions

$$A^{\alpha_1+\alpha_2}\,B^{\beta_1+\beta_2}\,C^{\gamma_1+\gamma_2}\cdots$$
$$A^{\alpha_1+2\alpha_2}\,B^{\beta_1+2\beta_2}\,C^{\gamma_1+2\gamma_2}\cdots$$
$$\vdots$$
$$A^{\alpha_1+(p-1)\alpha_2}\,B^{\beta_1+(p-1)\beta_2}\,C^{\gamma_1+(p-1)\gamma_2}\cdots$$

where (i) the powers are all to be reduced modulo $p$ (that is, divided by $p$ and the remainder substituted), or, what is the same thing, use is to be made of the relation $A^p = B^p = C^p = \cdots = 1$, and (ii) the power of the first letter of any symbol is forced to be unity, by taking when necessary the power of the symbol which will make this the case. To give a concrete example, in the $3^3$ system with factors $a$, $b$, and $c$, two of the four three-factor interaction pairs of degrees of freedom are $ABC$ and $ABC^2$: their generalized interaction consists of

$$ABC \cdot ABC^2 = A^2B^2C^3 = A^2B^2 = (A^2B^2)^2 = A^4B^4 = AB,$$

and

$$ABC \cdot A^2B^2C^4 = A^3B^3C^5 = C^2 = C.$$
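The worked example can be reproduced mechanically. In the sketch below a symbol is again represented by its tuple of exponents; the function names are assumptions made for the illustration, and the two input symbols are assumed to be distinct in canonical form (otherwise one of the products would reduce to the identity).

```python
def normalize(exps, p):
    """Reduce exponents mod p and force the first non-zero exponent to 1."""
    lead = next((e % p for e in exps if e % p), None)
    if lead is None:
        raise ValueError("product reduced to the identity: inputs not distinct")
    inv = pow(lead, -1, p)                 # inverse modulo the prime p
    return tuple((e * inv) % p for e in exps)


def generalized_interaction(x, y, p):
    """Return the (p-1) symbols X*Y, X*Y^2, ..., X*Y^(p-1) in canonical form."""
    return [
        normalize(tuple(a + k * b for a, b in zip(x, y)), p)
        for k in range(1, p)
    ]


ABC = (1, 1, 1)
ABC2 = (1, 1, 2)
print(generalized_interaction(ABC, ABC2, 3))   # exponent vectors of AB and C
```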

If then a $3^3$ factorial experiment is arranged in blocks of three plots so as to confound $ABC$ and $ABC^2$, $AB$ and $C$ will also be confounded.

(b) The yields of treatment combinations in terms of effects and interactions. In the case of the $2^n$ factorial experiment, it is well known that the yield of a treatment combination may be expressed in terms of effects and interactions. Thus if three factors, $a$, $b$, $c$, are tested, the yield of the treatment combination $a_i b_j c_k$ is given by the expression:

$$\text{mean} + \tfrac{1}{2}\left[(-1)^{i-1}A + (-1)^{j-1}B + (-1)^{k-1}C + (-1)^{i+j}AB + (-1)^{i+k}AC + (-1)^{j+k}BC + (-1)^{i+j+k-1}ABC\right],$$

where the effects and interactions have been reduced to a single-plot basis.

$A$ is here defined as the difference between the mean yield of plots receiving factor $a$ at the unit level and the mean yield of plots receiving the factor at the zero level; or in algebraic terms

$$A = \tfrac{1}{4}(a-1)(b+1)(c+1),$$

where the algebraic expression is to be expanded and the corresponding yields substituted for the treatment combinations; the interaction $AB$ is likewise defined as

$$AB = \tfrac{1}{4}(a-1)(b-1)(c+1),$$

and so on.

In the general case with factors $a, b, c, \ldots$ each at $p$ levels, effects or interactions cannot be represented by a single difference, since there are in all $(p-1)$ differences. In this paper, the mean yield of the plots for which factor $a$ was at level $i$ is denoted by $(A)_i$. In the case of interactions we denote by $(A^{\alpha}B^{\beta}C^{\gamma})_{\alpha i+\beta j+\gamma k}$ the mean yield of those plots for which

$$\alpha x_1 + \beta x_2 + \gamma x_3 = \alpha i + \beta j + \gamma k \pmod{p},$$

which is therefore the mean yield of the plots that enter into the particular component of the interaction $ABC$ specified by the plot $a_i b_j c_k$. Thus, considering the plot in the $3^3$ system which receives the treatment combination $a_1 b_2 c_1$ and the interaction $ABC^2$,

$$\alpha x_1 + \beta x_2 + \gamma x_3 = 1(1) + 1(2) + 2(1) = 5 = 2 \pmod{3},$$

and the quantity $(A^{\alpha}B^{\beta}C^{\gamma})_{\alpha i+\beta j+\gamma k}$ becomes $(ABC^2)_2$ and is defined to represent the mean yield of those plots for which

$$x_1 + x_2 + 2x_3 = 2 \pmod{3}.$$

With this notation, the yield of the treatment combination $a_i b_j c_k \cdots$ is given by

$$\begin{aligned}
\text{mean} &+ (A)_i + (B)_j + (C)_k + \cdots \\
&+ (AB)_{i+j} + (AB^2)_{i+2j} + \cdots + (AB^{p-1})_{i+(p-1)j} \\
&+ (AC)_{i+k} + (AC^2)_{i+2k} + \cdots + (AC^{p-1})_{i+(p-1)k} \\
&+ (BC)_{j+k} + (BC^2)_{j+2k} + \cdots + (BC^{p-1})_{j+(p-1)k} \\
&+ (ABC)_{i+j+k} + (ABC^2)_{i+j+2k} + \cdots + (AB^{p-1}C^{p-1})_{i+(p-1)j+(p-1)k} + \cdots,
\end{aligned}$$

where the suffices are all reduced modulo $p$.

That this gives the same as the above formula for the $2^n$ system may be seen by noting that:

$$A = (A)_1 - (A)_0 = 2\left[(A)_1 - \text{mean}\right]$$

and so on.
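The expansion of a yield into effects and interactions can be verified numerically. The sketch below uses arbitrary simulated yields for a $3^3$ system and assumes, as the comparison with the $2^n$ formula suggests, that each class mean enters the sum as a deviation from the general mean; the function names and the simulated data are hypothetical.

```python
import random
from itertools import product


def normalize(exps, p):
    """Rescale so that the first non-zero exponent is 1 (mod p)."""
    lead = next(e % p for e in exps if e % p)
    inv = pow(lead, -1, p)
    return tuple((e * inv) % p for e in exps)


def effects(p, n):
    """One exponent vector per effect or interaction of the p^n system."""
    return sorted({normalize(e, p) for e in product(range(p), repeat=n) if any(e)})


def rebuild(yields, p, n):
    """Rebuild every yield as mean + sum over effects of (class mean - mean)."""
    combos = list(product(range(p), repeat=n))
    mean = sum(yields[x] for x in combos) / len(combos)
    rebuilt = dict.fromkeys(combos, mean)
    for e in effects(p, n):
        cls = {x: sum(a * b for a, b in zip(e, x)) % p for x in combos}
        totals = [0.0] * p
        for x in combos:
            totals[cls[x]] += yields[x]
        for x in combos:
            # each class holds p^(n-1) plots; its mean is taken about the general mean
            rebuilt[x] += totals[cls[x]] / p ** (n - 1) - mean
    return rebuilt


rng = random.Random(0)
p, n = 3, 3
yields = {x: rng.uniform(10.0, 20.0) for x in product(range(p), repeat=n)}
rebuilt = rebuild(yields, p, n)
max_error = max(abs(rebuilt[x] - yields[x]) for x in yields)
print("largest reconstruction error:", max_error)
```

The reconstruction is exact up to rounding because the $(p^n-1)/(p-1)$ sets of contrasts, each with $(p-1)$ degrees of freedom, together with the mean account for all $p^n$ degrees of freedom.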

III. TRUE FACTORIAL DESIGNS AND QUASI-FACTORIAL DESIGNS

The above formulation of the factorial scheme is particularly useful in the consideration of true factorial designs. In the particular case when $p = 2$, the formulation is identical with that which has always been used to devise systems of confounding and fractional replication. In the case when $p = 3$, the present formulation may be used with advantage to simplify the discussion of confounding and fractional replication for the system.

In the true factorial system, when $n$ individual factors are tested, use is made of the fact that main effects and interactions between a small number of factors are of importance, but that interactions between many factors are likely to have negligible values in comparison with the experimental error. It is therefore possible to arrange the experiment in blocks, confounding between blocks interactions which the experimenter considers unimportant, with a resultant increase in precision.

In the quasi-factorial design, on the other hand, the effects and interactions are purely formal and all are of equal importance, as the table of variety means which is finally required can only be obtained with knowledge of all effects and interactions, no matter what order. It is necessary to adopt some formal confounding within each replicate in order to reduce the experimental error by reducing the size of block.

Within any one replicate, some of the effects or interactions are obtained by comparison of block (or row or column) totals and the remainder by comparisons within blocks. The comparison of block totals is subject to an error denoted the inter-block error, and comparisons within blocks are subject to the intra-block error. In general, the inter-block error is considerably greater than the intra-block error [4]. In order that estimates of reasonable precision may be made for all the effects and interactions, it is necessary that different effects and interactions be confounded in the different replicates.

IV. DESIGNS FOR $p^n$ VARIETIES IN BLOCKS OF $p$ PLOTS

An outline of the possible designs, considering any number of replicates, is now presented.

(a) $n = 2$. If the factors are denoted by $a$ and $b$, then arrangements in blocks of $p$ plots exist which confound one of the following effects or interactions: $A, B, AB, AB^2, \ldots, AB^{p-1}$. The case in which $A$ is confounded in one replicate and $B$ in another replicate is called the two-dimensional lattice with two sets, or the simple lattice. A design in which $A$ is confounded in one replicate, $B$ in another, and one of $AB, AB^2, AB^3, \ldots, AB^{p-1}$ in a third is called the triple lattice. It is possible to have $p+1$ replicates in all, when each effect and interaction is confounded in one of the $p+1$ replicates and unconfounded in the other $p$ replicates.
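The replicates for $n = 2$ can be generated directly from the linear forms that define the confounded effects. The sketch below is an illustration under assumed names (`replicate`, `balanced_lattice`, `pair_counts`) with $p = 5$ chosen arbitrarily; a variety is identified with its pair of pseudo-factor levels, and a block collects the varieties sharing one value of the confounded form.

```python
from collections import Counter
from itertools import combinations, product


def replicate(p, coeffs):
    """Blocks of p plots confounding the effect a*x1 + b*x2 (mod p)."""
    a, b = coeffs
    blocks = {r: [] for r in range(p)}
    for x1, x2 in product(range(p), repeat=2):
        blocks[(a * x1 + b * x2) % p].append((x1, x2))
    return list(blocks.values())


def balanced_lattice(p):
    """All p + 1 replicates: confound A, B, AB, AB^2, ..., AB^(p-1) in turn."""
    confounded = [(1, 0), (0, 1)] + [(1, k) for k in range(1, p)]
    return [replicate(p, c) for c in confounded]


def pair_counts(design):
    """Count how often each pair of varieties shares a block."""
    counts = Counter()
    for rep in design:
        for block in rep:
            for pair in combinations(sorted(block), 2):
                counts[pair] += 1
    return counts


design = balanced_lattice(5)
simple_lattice = design[:2]      # A and B confounded
triple_lattice = design[:3]      # A, B and AB confounded
print(len(design), "replicates;", set(pair_counts(design).values()))
```

With all $p + 1$ replicates every pair of varieties occurs together in a block exactly once, which reflects the statement that each effect is confounded in one replicate and free in the other $p$.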

(b) $n = 3$. Let the factors be $a$, $b$, $c$. It is possible to use blocks of $p$ plots, with $p^2$ blocks randomized within each replicate, the effects and interactions which are confounded with blocks being chosen from $A, B, AB, AB^2, \ldots, AB^{p-1}, C, AC, AC^2, \ldots, AC^{p-1}, BC, BC^2, \ldots, BC^{p-1}, ABC, ABC^2, \ldots, AB^2C, \ldots, AB^{p-1}C^{p-1}$. The group of interactions confounded in any one replicate will consist of two effects or interactions with their generalized interaction as defined above. If, for example, the effects $A$, $B$ are chosen to be confounded in a replicate, then so must the interactions $AB, AB^2, \ldots, AB^{p-1}$. It is clear that several choices are possible, and later in this paper particular examples are discussed. If, as is generally the case, intra-block information is required on all effects, at least three replicates must be used.

(c) General $n$. Let the factors be denoted by $a, b, c, \ldots$. Then a replicate in $p^{n-1}$ blocks of $p$ plots could be obtained by confounding, for example, all the effects and interactions of $(n-1)$ of the factors. The principles on which designs are based will become clear from the consideration of particular cases. In general, at least $n$ replicates must be used. This paper does not deal with designs for a number of varieties which is the power of a non-prime, but it will be clear from the discussion that $n$-dimensional lattices with $n$ replicates of $k^{n-1}$ blocks of $k$ plots for a number of varieties $k^n$, where $k$ is not a prime, may be constructed by confounding each pseudo-factor in all but one of the $n$ replicates. The number of different replicates possible depends on the properties of $k \times k$ Latin squares. Some of the properties are discussed by Finney [10].

(d) p³ system or 3-dimensional lattice in blocks of p. (i) Design.

Let the factors be denoted by a, b, c, each having 3 levels for purposes
of illustration. Then it is necessary in each replicate to confound
8 [= (p² - 1)] degrees of freedom. The possible schemes of confounding
are given in Table 1.

The order in which the replicates are given in Table 1 is logical
and arises as follows. Consider first effect A to be confounded; if B
is also confounded, so then are AB and AB², and if either AB or AB² is
confounded then so is the other and B; next, the group A, C, AC, AC²
is found: the next possible interaction to be confounded along with A is
BC and this results in the confounding of ABC and AB²C²: finally


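TABLE 1

Possible Confounding Systems for 3³ System in Blocks of 3

                              Effect or Interaction

System  A     B     AB    AB²   C     AC    AC²   BC    BC²   ABC   ABC²  AB²C  AB²C²
   1    X     X     X     X     .     .     .     .     .     .     .     .     .
   2    X     .     .     .     X     X     X     .     .     .     .     .     .
   3    X     .     .     .     .     .     .     X     .     X     .     .     X
   4    X     .     .     .     .     .     .     .     X     .     X     X     .
   5    .     X     .     .     X     .     .     X     X     .     .     .     .
   6    .     X     .     .     .     X     .     .     .     X     .     X     .
   7    .     X     .     .     .     .     X     .     .     .     X     .     X
   8    .     .     X     .     X     .     .     .     .     X     X     .     .
   9    .     .     X     .     .     X     .     .     X     .     .     .     X
  10    .     .     X     .     .     .     X     X     .     .     .     X     .
  11    .     .     .     X     X     .     .     .     .     .     .     X     X
  12    .     .     .     X     .     X     .     X     .     .     X     .     .
  13    .     .     .     X     .     .     X     .     X     X     .     .     .
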


BIOMETRICS MARCH 1948 


if BC² is confounded along with A then so are ABC² and AB²C: all the
main effects and interactions have then been confounded with A in one
of the replicates. Consider next the effect B to be confounded: it has
already been considered with effect A and interactions AB and AB², so
we may proceed from C across the table as with A previously. In this
way the 13 (in general p² + p + 1 for a p³ system) possible systems of
confounding of the 3³ system in blocks of three have been generated.
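
The enumeration just described can be checked mechanically. The following is a minimal sketch, not the authors' procedure: it assumes each effect or interaction is written as a vector of exponents modulo p (AB² as (1, 2, 0)), scaled so that the first nonzero exponent is 1, and it collects every group generated by two distinct effects. The function names are illustrative only.

```python
from itertools import product

def normalize(v, p):
    """Scale v so its first nonzero exponent is 1 (A^2B^2 and AB are the same effect)."""
    lead = next(x for x in v if x)
    inv = pow(lead, -1, p)  # modular inverse, available from Python 3.8
    return tuple((inv * x) % p for x in v)

def effects(n, p):
    """All (p**n - 1)/(p - 1) effects and interactions of a p**n system."""
    return {normalize(v, p) for v in product(range(p), repeat=n) if any(v)}

def span(u, v, p):
    """Effects in the group generated by u and v (their generalized interactions)."""
    out = set()
    for s, t in product(range(p), repeat=2):
        w = tuple((s * a + t * b) % p for a, b in zip(u, v))
        if any(w):
            out.add(normalize(w, p))
    return frozenset(out)

def confounding_systems(p):
    """Distinct systems of confounding for a p**3 system in blocks of p plots."""
    eff = list(effects(3, p))
    return {span(u, v, p) for u in eff for v in eff if u != v}

def name(v):
    """Readable label, e.g. (1, 2, 0) -> 'AB^2'."""
    return "".join(f + ("" if e == 1 else f"^{e}") for f, e in zip("ABC", v) if e)

for system in sorted(sorted(name(e) for e in s) for s in confounding_systems(3)):
    print(system)
print(len(confounding_systems(3)))  # p*p + p + 1 systems for p = 3
```

Each system found this way contains p + 1 effects, carrying (p + 1)(p - 1) = p² - 1 degrees of freedom, which agrees with the 8 degrees of freedom confounded per replicate when p = 3.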

The constitution of the blocks for a particular system of confounding
may be determined from consideration of the definition of the effects
and interactions. The constitution of the blocks in the case when A,
B, AB and AB² are confounded is simply obtained. A less obvious case
is considered here, say when AB², AC, BC and ABC² are confounded.
Let the varieties be denoted by a_i b_j c_k where each of i, j, k runs
from 0 to 2. Then the combinations which enter into groups of three
blocks are as follows (the suffixes i, j, k are used to denote the
particular varieties):


       i + 2j = 0 (mod 3): 000, 001, 002, 110, 111, 112, 220, 221, 222
AB²    i + 2j = 1 (mod 3): 020, 021, 022, 100, 101, 102, 210, 211, 212
       i + 2j = 2 (mod 3): 010, 011, 012, 120, 121, 122, 200, 201, 202

       i + k = 0 (mod 3): 000, 010, 020, 102, 112, 122, 201, 211, 221
AC     i + k = 1 (mod 3): 001, 011, 021, 100, 110, 120, 202, 212, 222
       i + k = 2 (mod 3): 002, 012, 022, 101, 111, 121, 200, 210, 220


It is easy to see that the nine blocks are:

   I     II    III   IV    V     VI    VII   VIII  IX
  000   001   002   020   021   022   010   011   012
  112   110   111   102   100   101   122   120   121
  221   222   220   211   212   210   201   202   200


The contrast represented by AB², for example, is then given by the
contrast of blocks I + II + III, blocks IV + V + VI, and blocks
VII + VIII + IX.
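
As an illustration of how the blocks follow from the definitions, the sketch below groups the 27 varieties by the levels of the two generating interactions AB² and AC. It is only a small check of the worked case above; the function name and the exponent-tuple encoding are assumptions made for the example.

```python
from itertools import product

P = 3  # number of levels of each pseudo-factor

def build_blocks(generators, p=P):
    """Group the p**3 varieties into blocks by the levels of the generators.

    Each generator is a tuple of exponents (a, b, c); AB^2 is (1, 2, 0).
    Two varieties share a block when they agree on (a*i + b*j + c*k) mod p
    for every generator.
    """
    blocks = {}
    for i, j, k in product(range(p), repeat=3):
        key = tuple((a * i + b * j + c * k) % p for a, b, c in generators)
        blocks.setdefault(key, []).append(f"{i}{j}{k}")
    return blocks

AB2 = (1, 2, 0)
AC = (1, 0, 1)
blocks = build_blocks([AB2, AC])
for key in sorted(blocks):
    # key = (level of AB^2, level of AC); keys are listed in the order I to IX
    print(key, blocks[key])
```

Because BC and ABC² are generalized interactions of AB² and AC, fixing the levels of the two generators fixes theirs as well, so no further conditions are needed to form the nine blocks.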

In planning variety trials of the type considered in this paper for a
particular number of replicates, it is important to choose systems of
confounding (as given, for example, in Table 1) so that the total amount
of confounding is spread as far as possible over all the possible effects
and interactions. The information available on any particular effect
or interaction will be of two types: (a) intra-block, that is, based on
within-block comparisons, and (b) inter-block, that is, based on
comparisons among block totals. In general, information of type (a) will
be more accurate since block variability, measured by differences among
blocks after taking account of variety differences, will be large
compared with the within-block variability [4]. The experiment is
therefore designed to yield as much information as possible of type (a).

If three replicates are to be used, a number of choices of equal value
are available which, by permutation of symbols, are equivalent to the
choice of replicates consisting of numbers 1, 2, and 5: that is, with
A, B, AB and AB² confounded in one replicate; A, C, AC and AC²
confounded in a second replicate; and B, C, BC and BC² confounded in a
third replicate. If four replicates are to be used, the best choice
consists of the following:


     number 1   confounding   A,   B,   AB,   AB²
     number 2        "        A,   C,   AC,   AC²
     number 5        "        B,   C,   BC,   BC²
     number 9        "        AB,  AC,  BC²,  AB²C²


Replicate number 10, 12, or 13 may be used in place of number 9.
With this choice, three main effects and three interactions are
confounded in two out of the four replicates, and four interactions are
confounded in one of the four replicates. The remaining three
interactions are completely unconfounded.

In the same way the best choice for any number of replicates up to 
13 may be considered. The systems of confounding are rearranged in 
Table 2 so that the best design (or the more nearly best) for r replicates 
is given by the first r rows of the table. The table was constructed on 
the principle that all the effects and interactions should be confounded 
in as equal a number of replicates as possible: on investigation it was 
found that there is no unique order but the order given is nearly the 
best under all circumstances. 

(ii) Analysis. A method for the k³ system with 3 replicates
consisting of numbers 1 (or X), 2 (or Y), and 5 (or Z), or multiples of
these three replicates, has been described by Yates [8]. It is not the
purpose here to give a detailed computational procedure for all the
possible cases, but merely to indicate the structure of the analysis and
of the procedure for estimating adjusted varietal means. The analysis
is given in terms of the concepts used in this paper. The confounding
was as follows:


     replicate X:   A,   B,   AB,   AB²,
         "     Y:   A,   C,   AC,   AC²,
         "     Z:   B,   C,   BC,   BC².


The structure of the analysis of variance is given in Table 3:

TABLE 2

Schemes of Confounding in Order of Value

                              Effect or Interaction

System  A     B     AB    AB²   C     AC    AC²   BC    BC²   ABC   ABC²  AB²C  AB²C²
   1    X     X     X     X     .     .     .     .     .     .     .     .     .
   2    X     .     .     .     X     X     X     .     .     .     .     .     .
   5    .     X     .     .     X     .     .     X     X     .     .     .     .
   9    .     .     X     .     .     X     .     .     X     .     .     .     X
  13    .     .     .     X     .     .     X     .     X     X     .     .     .
  12    .     .     .     X     .     X     .     X     .     .     X     .     .
  10    .     .     X     .     .     .     X     X     .     .     .     X     .
  11    .     .     .     X     X     .     .     .     .     .     .     X     X
   8    .     .     X     .     X     .     .     .     .     X     X     .     .
   3    X     .     .     .     .     .     .     X     .     X     .     .     X
   6    .     X     .     .     .     X     .     .     .     X     .     X     .
   7    .     X     .     .     .     .     X     .     .     .     X     .     X
   4    X     .     .     .     .     .     .     .     X     .     X     X     .


TABLE 3

Structure of Analysis of Variance for 3 Replicates
of p³ System in Blocks of p Plots

                                       DF                Expectation of
                                General         p = 3    Mean Square (General)
Replicates                      2                  2
Blocks  component (a)           3(p - 1)           6     σ_i² + pσ_b²
        component (b)           3(p - 1)           6     σ_i² + (1/3)pσ_b²
        component (c)           3(p - 1)²         12     σ_i² + (2/3)pσ_b²
Varieties (ignoring blocks)     p³ - 1            26
Error (intra-block)             2p³ - 3p² + 1     28
Total                           3p³ - 1           80

The degrees of freedom and sum of squares for blocks component (a) are
obtained by noting that the blocks of replicates X and Y may both be
grouped to estimate the effect A; the interaction of these two replicates
and the A effect contributes two degrees of freedom: comparison of
effect B in replicates X and Z, and of effect C in replicates Y and Z,
each contribute two degrees of freedom, making a total of 6. The blocks
component (b) is obtained by comparing the mean A effect in replicates
X and Y against the unconfounded A effect in replicate Z, and
correspondingly for effects B and C, giving 6 degrees of freedom. The
blocks component (c) is obtained by comparing each interaction which
is confounded in one replicate only with the unconfounded interaction
estimated from the other two replicates; each comparison yields two
degrees of freedom and the six interactions altogether give a total of 12
degrees of freedom. The sum of squares for varieties ignoring blocks
is computed directly from the variety totals, and the sum of squares for
replicates in the usual way. The error term is obtained by subtraction.

The expectations of the mean squares of interest, as stated by Yates
[8], are given in the last column of Table 3 in terms of the within-block
variance σ_i² and the additional between-block variance σ_b².

The evaluation of effects and interactions for this case proceeds
exactly as described by Yates [8], but since only effects and two factor
interactions are confounded it is convenient to state the method in a
more general form. Consider the effect A for example: it is
confounded in replicates X and Y but is unconfounded in replicate Z.
Estimates may be made for each of the replicates to make up a table of
the form:

                   (A)_0      (A)_1      (A)_2
     Rep. X       (A)_0X     (A)_1X     (A)_2X
          Y       (A)_0Y     (A)_1Y     (A)_2Y
          Z       (A)_0Z     (A)_1Z     (A)_2Z



For the comparisons of (A)_0, (A)_1, and (A)_2 in replicates X and Y,
which are inter-block comparisons, each quantity is subject in the
general case to variance (1/p²)(pσ_b² + σ_i²), while for replicate Z the
comparisons are intra-block, each quantity having a variance σ_i²/p². If
two quantities α and β are independent estimates of a parameter with
variances σ² and σ'², respectively, the linear function of α and β which is
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replicate the totals of plots for which the level of pseudo-factor a is 
zero, for which it is one, and for which it is two, may be evaluated and 
denoted by A'_0, A'_1, A'_2 respectively. Similarly, obtain for b and c
the totals B'_0, B'_1, B'_2, and C'_0, C'_1, C'_2, and for each
interaction, quantities like ABC'_0, ABC'_1, ABC'_2. The results may be
arranged as in Table 4.
Asterisks are inserted to denote the effects and interactions which
are confounded in each replicate. Final estimates of these effects and
interactions are obtained by forming weighted means of estimates in
each replicate across the table, using weights w when the effect is
unconfounded and weights w' when it is confounded, i.e., when there is an
asterisk. This table is also useful in the evaluation of the sum of
squares for blocks adjusted for varieties. Component (a) is obtained
by taking, for each effect which is confounded in two replicates, the
differences between the values in the two replicates and obtaining the
sum of squares between the three differences with the appropriate
numerical divisor. Component (b) is obtained by constructing for each
of these effects twice the unconfounded values minus the sum of the
confounded values and evaluating the sum of squares for the differences
of these three quantities. Component (c) is obtained by evaluating for
each effect confounded in one replicate, twice the confounded effect
minus the sum of the unconfounded effects in the two replicates, and
evaluating the sum of squares between the three quantities, again
yielding two degrees of freedom for each effect.

Having obtained estimates of effects and interactions which utilize
the inter-block information, the adjusted variety totals may be obtained
by the use of the formula given above. This is computationally a
tedious process and the value of Yates' scheme of computation lies in
the fact that this is simplified by the representation of a cube in two
dimensions, with the necessity only of adjusting tables involving two
factors. The method is described here not necessarily as the best
computationally, but as the one which gives clearly the structure of the
analysis and which may be extended to deal with any number of
replicates for prime power lattices. The great bulk of the work of the
analysis of variance and calculation of variety means may be dealt
with simply by the use of punched-card machines. The evaluation of
the variance of differences between variety means and the discussion of
the efficiency of the designs are presented in a later section.

(e) p⁴ system or 4-dimensional lattice in blocks of p. For the
purposes of illustration let the pseudo-factors be denoted by a, b, c, d,
each having 3 levels. As indicated in the construction of Table 1 for the
3³ system, it is necessary in the enumeration of possible schemes of
confounding only to designate effects or interactions which may be used to
generate the whole group of effects and interactions which are
confounded. The 40 (equals p³ + p² + p + 1) possible schemes are
designated in Table 5.


TABLE 5

Schemes of Confounding for 3⁴ System
in Blocks of 3 Plots

Generators of group of confounded interactions

Number   Generators          Number   Generators
   1.    A, B, C               21.    B, AC², AD
   2.    A, B, D               22.    B, AC², AD²
   3.    A, B, CD              23.    AB, C, D
   4.    A, B, CD²             24.    AB, C, AD
   5.    A, C, D               25.    AB, C, AD²
   6.    A, C, BD              26.    AB, AC, D
   7.    A, C, BD²             27.    AB, AC, AD
   8.    A, BC, D              28.    AB, AC, AD²
   9.    A, BC, BD             29.    AB, AC², D
  10.    A, BC, BD²            30.    AB, AC², AD
  11.    A, BC², D             31.    AB, AC², AD²
  12.    A, BC², BD            32.    AB², C, D
  13.    A, BC², BD²           33.    AB², C, AD
  14.    B, C, D               34.    AB², C, AD²
  15.    B, C, AD              35.    AB², AC, D
  16.    B, C, AD²             36.    AB², AC, AD
  17.    B, AC, D              37.    AB², AC, AD²
  18.    B, AC, AD             38.    AB², AC², D
  19.    B, AC, AD²            39.    AB², AC², AD
  20.    B, AC², D             40.    AB², AC², AD²


The minimum number of replicates which may be used is four, X,
Y, Z, W, say, the confounding being

     replicate X: A, B, C and their interactions
         "     Y: A, B, D  "    "        "
         "     Z: A, C, D  "    "        "
         "     W: B, C, D  "    "        "

Each main effect is then confounded in three of the four replicates.

The structure of the analysis of variance is given in Table 6. 

Blocks component (a) will consist of the comparison between replicates
of effects which are confounded in those replicates. Thus each of the
effects A, B, C, D yields 4 degrees of freedom, with a total of 16. Those
for A are the interaction of the table

                      (A)_0     (A)_1     (A)_2
     replicate X:       -         -         -
               Y:
               Z:

TABLE 6

Structure of Analysis of Variance for p⁴ Varieties in
Blocks of p Plots with 4 Replicates

                                       DF                Expectation of
                                General         p = 3    Mean Square (General)
Replicates                      3                  3
Blocks  component (a)           8(p - 1)          16     σ_i² + pσ_b²
                  (b)           6(p - 1)²         24     σ_i² + pσ_b²
                  (c)           4(p - 1)           8     σ_i² + (1/4)pσ_b²
                  (d)           6(p - 1)²         24     σ_i² + (2/4)pσ_b²
                  (e)           4(p - 1)³         32     σ_i² + (3/4)pσ_b²
Varieties ignoring blocks       p⁴ - 1            80
Error (intra-block)             3p⁴ - 4p³ + 1    136
Total                           4p⁴ - 1          323


Blocks component (b) arises from interactions which are confounded
in 2 of the 4 replicates. Each of the interactions AB, AB², AC, AC²,
AD, AD², BC, BC², BD, BD², CD, CD² yields 2 degrees of freedom, with
a total of 24 degrees of freedom. That for AB is obtained as the
interaction of the table

                      (AB)_0    (AB)_1    (AB)_2
     replicate X:       -         -         -
               Y:

Component (c) arises from the comparison of the mean confounded
effects and mean unconfounded effects and has 8 degrees of freedom in
all: for A, for example, the following table may be constructed.

                                          (A)_0     (A)_1     (A)_2
     replicate X:                           -         -         -
               Y:
               Z:
               W:
     3 (rep. W) - (rep. X + Y + Z):

The comparison of the three quantities in the last row of the table yields
2 degrees of freedom. Component (d) arises in the same way from the
2-factor interactions and has in all 24 (= 12 × 2) degrees of freedom.
Finally component (e) arises from the interactions which are
confounded in only one replicate. For each, the effect in the replicate in
which it is confounded is considered and also the effect in the other
three replicates. From these a comparison yielding 2 degrees of
freedom is obtained. There are 16 such interactions yielding 32 degrees of
freedom in all. The expectation of the mean squares in terms of σ_b²
and σ_i² as previously defined is given in the last column of the table.
The sum of squares for the several block components may be pooled and
used with the mean square for error in order to estimate σ_b² and σ_i² and
hence w = 1/σ_i², the weight of intra-block estimates, and
w' = 1/(σ_i² + pσ_b²), the weight of inter-block estimates.
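
One way the pooling might be carried out is sketched below. It is an illustration under stated assumptions rather than the procedure of the paper: the pooled block mean square is taken to have expectation σ_i² + c·p·σ_b², with c the degrees-of-freedom-weighted average of the coefficients in Table 6, σ_b² is estimated by equating mean squares to their expectations, and a negative estimate is replaced by zero. The mean squares 10.0 and 4.0 are made-up numbers.

```python
from fractions import Fraction

# (degrees of freedom, coefficient of p*sigma_b^2) for components (a)-(e)
# of Table 6 with p = 3
TABLE6_COMPONENTS = [
    (16, Fraction(1)),
    (24, Fraction(1)),
    (8, Fraction(1, 4)),
    (24, Fraction(2, 4)),
    (32, Fraction(3, 4)),
]

def pooled_coefficient(components):
    """Average coefficient c with E[pooled block MS] = sigma_i^2 + c*p*sigma_b^2."""
    total_df = sum(df for df, _ in components)
    return sum(df * c for df, c in components) / total_df

def estimate_weights(block_ms, error_ms, p, c):
    """Moment estimates of w = 1/sigma_i^2 and w' = 1/(sigma_i^2 + p*sigma_b^2).

    Assumption of this sketch: a negative estimate of sigma_b^2 is set to
    zero, which makes w' equal to w.
    """
    sigma_i2 = error_ms
    sigma_b2 = max((block_ms - error_ms) / (float(c) * p), 0.0)
    return 1.0 / sigma_i2, 1.0 / (sigma_i2 + p * sigma_b2)

c = pooled_coefficient(TABLE6_COMPONENTS)
w, w_prime = estimate_weights(block_ms=10.0, error_ms=4.0, p=3, c=c)
print(c, w, w_prime)
```

For the four-replicate design the pooled coefficient works out to 3/4, that is (r - 1)/r with r = 4.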

From this stage the analysis could be made in the same way as
described for the 3³ system and there is no need to describe it here. Not
more than 4 replicates are dealt with here, as this extension presents no
difficulty.

(f) p⁵ system or 5-dimensional lattice in blocks of p. This case is
considered as a further illustration of the principles, again with p = 3.
Let the factors be a, b, c, d, e. The minimum number of replicates is
5 and the confounding would then consist of the following:

     replicate V: A, B, C, D and all their interactions
         "     W: A, B, C, E  "   "    "        "
         "     X: A, B, D, E  "   "    "        "
         "     Y: A, C, D, E  "   "    "        "
         "     Z: B, C, D, E  "   "    "        "

The portion of the analysis of variance dealing with blocks with
400 (= 5 × 80) degrees of freedom is given in Table 7.

TABLE 7

Partition of Block Degrees of Freedom for p⁵ Varieties
in Blocks of p with 5 Replicates

                                       DF                Expectation of
                                General         p = 3    Mean Square

Comparison between replicates of
effects or interactions which are
confounded in these replicates.
  Main effects                  15(p - 1)         30  }
  Two-factor interactions       20(p - 1)²        80  }  σ_i² + pσ_b²
  Three-factor interactions     10(p - 1)³        80  }

Comparisons between mean of
confounded effects and mean of
unconfounded effects.
  Main effects                  5(p - 1)          10     σ_i² + (1/5)pσ_b²
  Two-factor interactions       10(p - 1)²        40     σ_i² + (2/5)pσ_b²
  Three-factor interactions     10(p - 1)³        80     σ_i² + (3/5)pσ_b²
  Four-factor interactions      5(p - 1)⁴         80     σ_i² + (4/5)pσ_b²


The entries in the last column in the second part of the above table
are obtained by use of the general rule that the expectation of the mean
square derived from the comparison, when the effect or interaction is
confounded in n_c replicates and unconfounded in n_u replicates, where
blocks of p plots are used, is

$$ \sigma_i^2 + \frac{n_u}{n_u + n_c}\, p\, \sigma_b^2 . $$

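
The rule can be used to regenerate the block partitions of Tables 3, 6 and 7. The sketch below assumes the n-dimensional lattice with n replicates, in which an s-factor interaction is confounded in the n - s replicates that confound all of its factors; the function name and the row layout are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from math import comb

def block_partition(n, p):
    """Block degrees of freedom for the p**n lattice with n replicates.

    Returns one row per order s of interaction (s = 1 is a main effect):
    (s, d.f. between confounded replicates,
        d.f. for mean confounded vs mean unconfounded,
        coefficient n_u/(n_u + n_c) of p*sigma_b^2 for the latter).
    """
    rows = []
    for s in range(1, n):
        n_effects = comb(n, s) * (p - 1) ** (s - 1)  # s-factor effects
        n_c, n_u = n - s, s                          # replicates confounding / not
        df_between = (n_c - 1) * (p - 1) * n_effects
        df_mean = (p - 1) * n_effects
        rows.append((s, df_between, df_mean, Fraction(n_u, n_u + n_c)))
    return rows

for row in block_partition(5, 3):
    print(row)
```

With n = 5 and p = 3 the rows give 30, 80, 80 and 0 degrees of freedom for the first part of Table 7 and 10, 40, 80 and 80 for the second, 400 in all.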


V. MEAN VARIANCE OF ADJUSTED VARIETAL COMPARISONS 

Any comparison between a pair of varieties may be expressed in
terms of the effects and interactions, using the notation above. Thus in
the 3³ system for example, the comparison of varieties 000 and 001 may
be expressed as follows:

$$
\begin{aligned}
y_{000} - y_{001} ={}& [(C)_0 - (C)_1] + [(AC)_0 - (AC)_1] + [(AC^2)_0 - (AC^2)_2] \\
&+ [(BC)_0 - (BC)_1] + [(BC^2)_0 - (BC^2)_2] \\
&+ [(ABC)_0 - (ABC)_1] + [(ABC^2)_0 - (ABC^2)_2] \\
&+ [(AB^2C)_0 - (AB^2C)_1] + [(AB^2C^2)_0 - (AB^2C^2)_2]
\end{aligned}
$$

Furthermore, the contrasts in the square brackets are independent of
each other, so that their variances may be added to give the variance of
the variety difference. If three replicates have been used confounding
A, B, AB, AB² in the first, A, C, AC, AC² in the second, and B, C, BC,
BC² in the third replicate, then the variances of the effects and
interactions comparisons are as follows: effects A, B, C have a variance
1/[9(w + 2w')]; interactions AB, AB², AC, AC², BC, BC² have a variance
of 1/[9(2w + w')]; and interactions ABC, ABC², AB²C, AB²C² have a
variance of 1/[9(3w)].

The variance of the above variety comparison is then

$$ \frac{2}{9}\left[\frac{1}{w + 2w'} + \frac{4}{2w + w'} + \frac{4}{3w}\right]. $$

The possible 26 comparisons of a variety with a chosen one will have
differing accuracies depending upon whether the comparison involves
one, two, or three of the main effects, but for practical purposes a single
measure of accuracy for all comparisons, namely the mean variance of
all the independent comparisons, is sufficient. This variance may be
obtained very easily for prime lattice designs with any number of
replicates by noting that each effect or interaction enters into the same
number of the comparisons of one variety with the remainder. In the
particular case of the 3³ system, each of the 26 independent comparisons
involves nine effects or interactions: 4 effects or interactions do not
enter into any particular comparison; in the above case, these are A, B,
AB and AB². Each effect or interaction comparison enters into 18 of the
total of 26 comparisons, so that the mean variance for all comparisons is
equal to
$$
\frac{18}{26}\,[\,V(A) + V(B) + V(C) + V(AB) + V(AB^2) + V(AC) + V(AC^2) + V(BC) + V(BC^2) + V(ABC) + V(ABC^2) + V(AB^2C) + V(AB^2C^2)\,],
$$

where V(A), for example, is the variance of each of A_0, A_1, A_2
appropriate for comparisons with each other, that is, the variance of the
difference of two of them, 2/[9(w + 2w')]. The mean variance is
therefore equal to

$$
\frac{2}{9}\cdot\frac{18}{26}\left[\frac{3}{w + 2w'} + \frac{6}{2w + w'} + \frac{4}{3w}\right]
= \frac{2}{13}\left[\frac{3}{w + 2w'} + \frac{6}{2w + w'} + \frac{4}{3w}\right].
$$

The formula may be generalized immediately to the case of a pⁿ system
or the n-dimensional lattice in blocks of p plots with r replicates; the
mean variance is

$$
\frac{2(p-1)}{p^n - 1}\left[\frac{n_1}{w + (r-1)w'} + \frac{n_2}{2w + (r-2)w'} + \cdots + \frac{n_r}{rw}\right],
$$

where

n_1 is the number of effects or interactions confounded in (r - 1) replicates,
n_2  "   "    "     "    "     "       "           "       "  (r - 2)    "
etc.

and n_r is the number of effects or interactions not confounded in any
replicate. The n_i must satisfy the two relations,

$$ n_1 + n_2 + \cdots + n_r = \frac{p^n - 1}{p - 1} $$

(the total number of effects and interactions), and

$$ n_1(r-1) + n_2(r-2) + \cdots + n_{r-1} = \frac{r\,(p^{n-1} - 1)}{p - 1} $$

(the total number of degrees of freedom for blocks within replicates
divided by (p - 1)).

Thus with the 3³ system with 4 replicates (p = 3, n = 3, r = 4) and
confounding as follows:

     replicate 1: A, B, AB, AB²
         "     2: A, C, AC, AC²
         "     3: B, C, BC, BC²
         "     4: AB, AC, BC², and AB²C²,

it is seen that

     n_1 = 0,
     n_2 = 6, namely A, B, C, AB, AC, BC²
     n_3 = 4, namely AB², AC², BC, AB²C²
     n_4 = 3, namely ABC, AB²C, ABC²,

and the mean variance is

$$ \frac{2}{13}\left[\frac{6}{2w + 2w'} + \frac{4}{3w + w'} + \frac{3}{4w}\right]. $$
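
The mean variance can be checked against a direct enumeration of the 26 comparisons with variety 000. The sketch below is an illustration only: it assumes the exponent-vector encoding of effects, takes the variance of one level mean of an effect as 1/[9(n_u w + n_c w')] when the effect is unconfounded in n_u and confounded in n_c replicates, and uses arbitrary values w = 1 and w' = 0.25. All names are hypothetical.

```python
from itertools import product

P = 3

def normalize(v):
    """Scale an exponent vector so its first nonzero entry is 1."""
    lead = next(x for x in v if x)
    inv = pow(lead, -1, P)
    return tuple(inv * x % P for x in v)

EFFECTS = sorted({normalize(v) for v in product(range(P), repeat=3) if any(v)})

def system(u, v):
    """Effects confounded together when u and v are confounded."""
    return {normalize(tuple((s * a + t * b) % P for a, b in zip(u, v)))
            for s, t in product(range(P), repeat=2) if s or t}

A, B, C = (1, 0, 0), (0, 1, 0), (0, 0, 1)
AB, AC = (1, 1, 0), (1, 0, 1)
REPLICATES = {1: system(A, B), 2: system(A, C), 5: system(B, C), 9: system(AB, AC)}

def effect_variance(e, reps, w, wp):
    """Variance of one level mean of effect e after weighting the replicates."""
    n_c = sum(e in REPLICATES[r] for r in reps)
    n_u = len(reps) - n_c
    return 1.0 / (P**2 * (n_u * w + n_c * wp))

def mean_variance_brute(reps, w, wp):
    """Average variance of the 26 comparisons of variety 000 with the others."""
    others = [v for v in product(range(P), repeat=3) if any(v)]
    total = 0.0
    for v in others:
        for e in EFFECTS:
            if sum(a * b for a, b in zip(e, v)) % P:  # levels of e differ
                total += 2 * effect_variance(e, reps, w, wp)
    return total / len(others)

def mean_variance_formula(counts, r, w, wp, n=3):
    """counts[s] is n_s: the number of effects unconfounded in s replicates."""
    return 2 * (P - 1) / (P**n - 1) * sum(
        c / (s * w + (r - s) * wp) for s, c in counts.items())

w, wp = 1.0, 0.25
print(mean_variance_brute([1, 2, 5], w, wp),
      mean_variance_formula({1: 3, 2: 6, 3: 4}, 3, w, wp))
print(mean_variance_brute([1, 2, 5, 9], w, wp),
      mean_variance_formula({2: 6, 3: 4, 4: 3}, 4, w, wp))
```

The two numbers printed on each line should agree, for three replicates (numbers 1, 2, 5) and for four (numbers 1, 2, 5, 9).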

A frequently used design will be the pⁿ design in blocks of p plots
with n replicates confounding all main effects but one and all their
interactions in each replicate. This may be called the n-dimensional
lattice for pⁿ varieties with n replicates in blocks of p plots. The mean
variance in this case is

$$
\frac{2(p-1)}{p^n - 1}\left[\frac{n}{w + (n-1)w'} + \frac{(p-1)\,{}_nC_2}{2w + (n-2)w'}
+ \frac{(p-1)^2\,{}_nC_3}{3w + (n-3)w'} + \cdots + \frac{(p-1)^{n-1}}{nw}\right],
$$

where ${}_nC_s$ equals n!/[s!(n - s)!].
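
A short numerical sketch of this special case follows. It assumes, as in the formula above, that the s-factor interactions number (p - 1)ˢ⁻¹ times the binomial coefficient and are each unconfounded in exactly s of the n replicates; the function names are illustrative.

```python
from math import comb

def lattice_counts(n, p):
    """n_s for the n-dimensional lattice with n replicates (s = 1, ..., n)."""
    return {s: comb(n, s) * (p - 1) ** (s - 1) for s in range(1, n + 1)}

def lattice_mean_variance(n, p, w, wp):
    """Mean variance of a varietal comparison, weights w (intra) and wp (inter)."""
    bracket = sum(c / (s * w + (n - s) * wp)
                  for s, c in lattice_counts(n, p).items())
    return 2 * (p - 1) / (p**n - 1) * bracket

print(lattice_counts(3, 3))
print(lattice_mean_variance(3, 3, w=1.0, wp=0.25))
```

The counts add up to (pⁿ - 1)/(p - 1), and when w = w' the mean variance reduces to 2/(nw), the value for n replicates without any block effect.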

VI. EFFICIENCY OF DESIGNS 

In studying the efficiency of lattice designs, 3 comparisons are of
particular interest:

(a) comparison with randomized blocks which make no use of
confounding,

(b) comparison of information per plot with varying numbers
and types of replicates,

(c) comparison of designs with varying sizes of block and hence
varying amounts of confounding.

The first type of comparison is in general of only academic interest,
as it is known that with the recovery of inter-block information the
lattice design is at least as efficient as the completely randomized design,
apart from a trivial loss of information due to inaccuracies in weighting
the confounded and unconfounded effects and interactions. If the
costs of analysis of the two types of experiment differed by an amount
appreciable in comparison with the field and other work of the
experiments (which are approximately the same in both cases), it would be
necessary to make this comparison.

The other types of comparison are of importance in the design of
particular experiments. Comparison type (a) above may be regarded
as a special case of comparison type (c), but for present purposes, it is
considered separately, as comparison (c) will be discussed in a
succeeding paper dealing with blocks of size which is a power of p.

In considering the efficiency of designs, no account is taken of the
loss in information which results from inaccuracies in the weights.
The examination of this point by Yates [8, 9] and Cochran [2]
indicates that it is of trivial importance.

(a) Comparison with randomized blocks. The mean variance per
comparison in the general case of a pⁿ design in blocks of p plots with
r replicates has been given above. If the experiment had been
arranged so that each replicate consisted of a randomized block of pⁿ
plots, the average error variance would be

$$ \frac{\dfrac{p^n - p^{n-1}}{w} + \dfrac{p^{n-1} - 1}{w'}}{p^n - 1}, $$

as (pⁿ⁻¹ - 1) degrees of freedom would have a variance of 1/w', and the
remaining pⁿ - 1 - (pⁿ⁻¹ - 1) = pⁿ - pⁿ⁻¹ degrees of freedom would have
a variance of 1/w. The mean variance per comparison would be this
quantity multiplied by 2/r.

The efficiency of the lattice design relative to randomized complete
blocks is then

$$
\frac{1}{r(p-1)}\cdot
\frac{\dfrac{p^n - p^{n-1}}{w} + \dfrac{p^{n-1} - 1}{w'}}
{\dfrac{n_1}{w + (r-1)w'} + \dfrac{n_2}{2w + (r-2)w'} + \cdots + \dfrac{n_r}{rw}} .
$$

If, for example, w/w' equals unity, that is, inter-block and intra-block
information are of equal accuracy, the above expression reduces to

$$ \frac{p^n - 1}{r(p-1)\,\dfrac{n_1 + n_2 + \cdots + n_r}{r}} $$

which, as

$$ n_1 + n_2 + \cdots + n_r = \frac{p^n - 1}{p - 1}, $$

is equal to unity. The relative efficiency may be easily computed from
the above formula, given a system of confounding and a value for w/w'.

Table 8 deals with only the particular case of the pⁿ lattice in
blocks of p plots with the least possible number of replicates, namely n.
In this case, the relative efficiency is

$$
\frac{\dfrac{p^n - p^{n-1}}{w/w'} + (p^{n-1} - 1)}
{n(p-1)\left[\dfrac{n}{w/w' + (n-1)} + \dfrac{(p-1)\,{}_nC_2}{2w/w' + (n-2)} + \cdots
+ \dfrac{(p-1)^{n-1}}{n\,w/w'}\right]} .
$$

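
The expression is easy to evaluate directly. The following sketch codes it as written, with `rho` standing for the ratio w/w' (a name chosen here for convenience), and prints the efficiency as a percentage for a few combinations of p, n and w/w'.

```python
from math import comb

def relative_efficiency(p, n, rho):
    """Percentage efficiency of the p**n lattice with n replicates in blocks
    of p plots, relative to randomized complete blocks; rho = w/w'."""
    numerator = (p**n - p**(n - 1)) / rho + (p**(n - 1) - 1)
    bracket = sum(
        comb(n, s) * (p - 1) ** (s - 1) / (s * rho + (n - s))
        for s in range(1, n + 1)
    )
    return 100 * numerator / (n * (p - 1) * bracket)

for p, n in [(2, 2), (3, 3), (5, 2)]:
    print(p, n, [round(relative_efficiency(p, n, rho)) for rho in (1, 2, 10)])
```

For p = 2, n = 2 and w/w' = 2, for instance, the value is about 109 per cent, and for p = 3, n = 3 and w/w' = 10 about 244 per cent.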


Most frequent use of lattice designs is made by plant breeders who 





TABLE 8

Relative Efficiency of n-Dimensional Lattice Designs for pⁿ Varieties
Compared to Randomized Complete Blocks (n Replicates
with Incomplete Block Size = p)

                                          w/w'
  p    n     pⁿ      1    2    3    4    5    6    7    8    9   10
  2    2      4    100  109  125  143  162  181  200  220  239  259
  2    3      8    100  110  127  146  165  185  205  225  245  266
  2    4     16    100  110  127  146  165  185  205  225  245  266
  2    5     32    100  110  128  147  166  186  206  226  246  266
  2    6     64    100  111  128  147  167  187  208  228  248  269
  2    7    128    100  111  129  148  168  189  209  230  251  271
  2    8    256    100  111  129  149  170  190  211  232  253  274
  2    9    512    100  111  130  150  170  192  213  234  256  277
  2   10   1024    100  111  130  150  171  193  214  236  258  280
  2   11   2048    100  111  130  151  172  194  216  238  260  282
  3    2      9    100  107  120  135  150  166  182  198  214  231
  3    3     27    100  108  123  139  156  173  191  208  226  244
  3    4     81    100  109  124  141  159  177  195  213  232  250
  3    5    243    100  109  125  142  161  179  198  217  236  255
  3    6    729    100  110  126  144  162  182  201  220  240  259
  3    7   2187    100  110  126  145  164  183  203  223  242  262
  5    2     25    100  105  114  125  136  148  160  172  184  196
  5    3    125    100  106  117  129  142  155  168  182  196  209
  5    4    625    100  106  118  131  144  158  172  186  200  215
  5    5   3125    100  107  119  132  146  160  174  189  204  219
  7    2     49    100  104  111  120  129  138  147  157  167  176
  7    3    343    100  105  113  123  133  143  154  165  176  186
  7    4   2401    100  105  114  124  135  146  157  168  179  190
 11    2    121    100  103  108  114  120  127  133  140  147  154
 11    3   1331    100  103  109  116  123  130  138  145  153  160
 13    2    169    100  102  107  112  117  123  129  135  141  147
 13    3   2197    100  103  108  114  120  126  133  139  146  152


often have no particular number of varieties to compare. For
example, the number of varieties may be of the order of 500 or 600; in
order to use an efficient experimental design, the geneticist will
constrict this number, and if he is considering prime-power lattice
designs he may use 2⁹ (= 512), or 5⁴ (= 625), or 23² (= 529) varieties.
Which choice he should make depends on several considerations, among
which is the relative efficiency of the two designs. A clear-cut answer
is not possible because the relative efficiency depends on the ratio
w/w' in the two cases.

(b) Comparison of information per plot with different numbers of
replicates. Earlier in the paper consideration was given to the use of
any number of replicates equal to or greater than the number of
pseudo-factors (which have been denoted by n). Particular attention was
given to the 3³ system when it was shown that the 13 possible schemes
of confounding could be arranged, intuitively at least, in an order such
that the first r schemes would be the best to use if r replicates were to
be used. Here this problem is considered in detail with reference to
the 3³ and 3⁴ systems.

For each number of replicates the efficiency of the design relative to 
complete randomized blocks, may be evaluated, and for particular 
values of w/w' the relative efficiency for varying numbers of replicates 
may be compared. This comparison is equivalent to comparing the 
information per plot of the various designs, as the information per plot 
with complete randomized blocks is independent of the number of 
replicates. 

For the 3 replicates given by the first 3 rows of Table 2, the relative
efficiency is

$$
\frac{18/(w/w') + 8}
{6\left[\dfrac{3}{w/w' + 2} + \dfrac{6}{2w/w' + 1} + \dfrac{4}{3w/w'}\right]} .
$$

For the 4 replicates given by the first 4 rows of Table 2, the relative
efficiency is

$$
\frac{18/(w/w') + 8}
{8\left[\dfrac{6}{2w/w' + 2} + \dfrac{4}{3w/w' + 1} + \dfrac{3}{4w/w'}\right]} ,
$$

and so on.

The values of the relative efficiency for numbers of replicates from
3 to 13 as specified by Table 2 are given in Table 9.
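
The same calculation can be carried through for any number of replicates by counting, for each effect, the replicates in which it is confounded. The sketch below assumes the order of systems 1, 2, 5, 9, 13, 12, 10, 11, 8, 3, 6, 7, 4 and writes each system by a pair of generating effects as exponent vectors; the generator pairs are a reconstruction from the grouping rules given earlier, and `rho` again stands for w/w'.

```python
from itertools import product

P = 3

def normalize(v):
    """Scale an exponent vector so its first nonzero entry is 1."""
    lead = next(x for x in v if x)
    inv = pow(lead, -1, P)
    return tuple(inv * x % P for x in v)

def system(u, v):
    """Effects confounded together when u and v are confounded."""
    return {normalize(tuple((s * a + t * b) % P for a, b in zip(u, v)))
            for s, t in product(range(P), repeat=2) if s or t}

A, B, C = (1, 0, 0), (0, 1, 0), (0, 0, 1)
AB, AB2 = (1, 1, 0), (1, 2, 0)
AC, AC2 = (1, 0, 1), (1, 0, 2)
BC, BC2 = (0, 1, 1), (0, 1, 2)

# Generators of systems 1, 2, 5, 9, 13, 12, 10, 11, 8, 3, 6, 7, 4 in that order
ORDER = [(A, B), (A, C), (B, C), (AB, AC), (AB2, AC2), (AB2, AC), (AB, AC2),
         (AB2, C), (AB, C), (A, BC), (B, AC), (B, AC2), (A, BC2)]
EFFECTS = {normalize(v) for v in product(range(P), repeat=3) if any(v)}

def efficiency(r, rho):
    """Percentage efficiency of the first r systems relative to randomized
    complete blocks, with rho = w/w'."""
    reps = [system(u, v) for u, v in ORDER[:r]]
    bracket = 0.0
    for e in EFFECTS:
        n_c = sum(e in rep for rep in reps)      # replicates confounding e
        bracket += 1.0 / ((r - n_c) * rho + n_c)
    numerator = (P**3 - P**2) / rho + (P**2 - 1)
    return 100 * numerator / (r * (P - 1) * bracket)

for r in (3, 4, 13):
    print(r, [round(efficiency(r, rho)) for rho in range(1, 11)])
```

With three replicates the bracket reduces to the 3, 6, 4 terms of the first expression above, and with four replicates to the 6, 4, 3 terms of the second.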

In the literature on the 3³ system, it is recommended that the
number of replicates be a multiple of 3, consisting of the first three
replicates repeated the requisite number of times. From Table 9 a
comparison may be made of the efficiency of an experiment of this type with
6 replicates using different systems of confounding with that of an
experiment in which the confounding is the same in pairs of replicates.
Of course, if w/w' is equal to unity, nothing is to be gained from the
use of different replicates but for moderate values of w/w' the efficiency
is around 10 per cent greater for the design with different systems of
confounding.

TABLE 9

Percentage Efficiencies of the 3³ Lattice Design for Various
Numbers of Replicates Relative to Randomized Complete Blocks

                                                Values of w/w'
Systems from Table 1              1    2    3    4    5    6    7    8    9   10
1,2,5                           100  108  123  139  156  173  191  208  226  244
1,2,5,9                         100  109  125  143  161  180  199  218  238  257
1,2,5,9,13                      100  110  126  144  163  182  202  222  241  261
1,2,5,9,13,12                   100  110  127  145  164  184  204  224  244  264
1,2,5,9,13,12,10                100  110  127  146  165  185  205  226  246  266
1,2,5,9,13,12,10,11             100  110  127  146  166  186  206  227  247  268
1,2,5,9,13,12,10,11,8           100  110  128  147  167  187  207  228  248  269
1,2,5,9,13,12,10,11,8,3         100  110  128  147  167  188  208  229  250  270
1,2,5,9,13,12,10,11,8,3,6       100  111  128  148  168  188  209  229  250  271
1,2,5,9,13,12,10,11,8,3,6,7     100  111  128  148  168  188  209  230  251  272
1,2,5,9,13,12,10,11,8,3,6,7,4   100  111  128  148  168  189  210  230  252  273


For the 3^4 system, it was not considered worth while to make as
complete an examination, and in Table 10 the relative efficiencies of
4, 5, 6 and 40 replicates are compared.


TABLE 10

Relative Information per Plot with 4, 5, 6, and
40 Replicates of 3^4 System

Sets                                  w/w'
                     1    2    3    4    5    6    7    8    9   10

1,2,5,14           100  109  124  141  159  177  195  213  232  250
1,2,5,14,27        100  109  125  143  162  181  200  220  239  259
1,2,5,14,27,40     100  110  126  144  163  183  202  222  242  262
All 40 sets        100  111  129  149  170  191  213  234  256  278


Again, when w/w' is small, there is little gain from using more than 
4 replicates (or multiples of 4 replicates). When w/w' is large, gains 
of up to 10 per cent are obtained. This, of course, supposes that the 
same size and shape of plots, blocks, and replicates are used in the 
possible cases. 
QUERIES


QUERY 56. In our investigations of processes of freezing vegetables, we
prepared samples from the same lot of raw material and subjected
half to process A and half to a second process B. In order to learn
if the eventual eating quality is different, and if so which product is
preferred, the samples were submitted to graders on two days: one
day AAB was submitted, another, ABB. The order of presentation
was randomized each day. The graders were asked if they could detect
any difference between the samples; and if so, which were identical.
They were also to indicate in each trio the sample or samples they liked
best.

Among 150 graders who were present both days, 17 correctly iden¬ 
tified duplicates. Of these 17, one preferred A both days, 7 pre¬ 
ferred B both days, and 8 preferred A one day and B the other. One 
had no preference either day. 

Now we can state one of our questions. What is the value of r in
the formula,

$$\chi^2 = \frac{(a - rb)^2}{r(a + b)}$$

when analyzing the data for those persons who identified duplicates
both days? We think the grader had one chance in 9 to identify
duplicates on both days; therefore, r = 8.

Now we will approach our second question which relates to the 
preference data for both days. These data obviously support the con¬ 
clusion that there is no important difference between the samples as 
far as these graders are concerned. But what if our data had been 
slightly different, say that 30 people had identified duplicates both 
days but of these 30, 15 preferred A one day but B the other day. 
Would we be justified in concluding from the 30 that the samples were 
different? On the other hand, maybe we should think of the difference 
data and the preference data as two separate categories and analyze 
each separately. What do you think? 

ANSWER. Under the hypothesis that the graders can detect no difference
between the products, the probability of correct association each day
is 1/3 so that the probability of success both days is 1/9; you are right
in taking r = 8 for substitution in the formula quoted provided a = 133
and b = 17. As you indicate, the evidence against the hypothesis is
trivial.
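As an illustration of the arithmetic, the following Python sketch evaluates the statistic, assuming the formula quoted is the usual chi-square for an expected ratio of r to 1, namely (a - rb)^2 / [r(a + b)], with a the number of failures and b the number of successes. The function name is hypothetical, and the second call uses the supposed case from the query of 30 successes among 150 graders.

```python
def chi_square_ratio(a, b, r):
    """Chi-square (1 d.f.) for testing an expected a:b ratio of r:1.

    Assumed reading of the quoted formula: (a - r*b)**2 / (r*(a + b)).
    """
    return (a - r * b) ** 2 / (r * (a + b))


# Data as reported: 17 of 150 graders identified duplicates on both days.
observed = chi_square_ratio(a=133, b=17, r=8)

# Supposed case from the query: 30 of 150 graders succeed on both days.
supposed = chi_square_ratio(a=120, b=30, r=8)

print(observed, supposed)
```

The first value is far below any conventional significance level for one degree of freedom, and the second is far above it, which matches the contrast drawn between the actual and the supposed data.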
The preference data suggest a modification in this conclusion.
Clearly the person who had no preference either day provides no in¬ 
formation on this point. Of the other 16, half had the same preference 
on both days while the other half changed. So far the results are 
exactly in accord with the hypothesis that A and B are the same. On 
the other hand, of the 8 who had consistent preferences, 7 chose B. 
This is an indication, though the evidence is not strong, that there may 
be a margin of preference in favor of B in the population of graders 
who have consistent preferences. 
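One way to gauge how strong this indication is would be a sign test on the 8 consistent graders. The sketch below is an assumption about method, not necessarily the calculation the writer had in mind: it supposes that, if A and B were equally liked, each consistent grader would choose B with probability 1/2.

```python
from math import comb


def upper_tail(successes, trials, p=0.5):
    """Probability of at least `successes` in `trials` binomial draws."""
    return sum(comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(successes, trials + 1))


# 7 of the 8 graders with consistent preferences chose B.
one_sided = upper_tail(7, 8)
two_sided = 2 * one_sided  # the null distribution is symmetric at p = 1/2

print(one_sided, two_sided)
```

Under that assumption the one-sided probability is 9/256, which is suggestive but rests on only 8 graders.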

I am not sure that I understand the meaning of your second ques¬ 
tion. Certainly, if it is among 150 graders that there are 30 successful 
identifications of duplicates on successive days, then there is strong evi¬ 
dence of a difference between A and B. As to preference, one may 
still set up the hypothesis that this is entirely random. If it is, the 
following statements of preference are equally likely: 


Day 1    Day 2

  A        A
  B        B
  A        B
  B        A


I suppose you mean that 15 of the 30 graders stated their preferences 
as either AB or BA. This is the number to be expected under the 
hypothesis set up. To this point, the preference results are similar to 
those in your first question. But no information is given about the 
consistent preferences. If they are somewhat evenly divided, the 
hypothesis of random choice need not be rejected. But if most of 
them preferred B both days, the results of your first experiment are 
accentuated. 

As for the third question, it seems to me that each part of the 
experiment contributes information on the other part. It may be that 
progress can be made toward segregating the insensitive graders from 
the sensitive, especially if it is found that these qualities are general 
rather than specific. If the generally insensitive are gradually elimi¬ 
nated, there is a saving of time and money. Both parts of the experi¬ 
ment are useful in making the separation. Those who both identify 
duplicates and have consistent preferences are promising candidates 
for further testing. 


George W. Snedecor 



THE BIOMETRIC SOCIETY 


The Biometric Society, a very flourishing infant, is gaining daily.
Any count of members will be far out of date by the time of publication.
As of March 1, although the majority of the 400 members
were from the Continental United States, 35 from other countries
represented beginnings in Argentina, Australia, Canada, Czechoslovakia,
Denmark, England, France, Hawaii, India, Italy, the Netherlands,
Norway, Sweden and Switzerland.

More interesting than the number of members are the fascinating
“mathematical and statistical aspects of biology” evidenced by members
interested and working in everything from aviation medicine to
the several facets of zoology. As if the millennium had arrived and
lions were in fact lying down with lambs, animal and plant breeding
geneticists, serologists, nutritionists, psychiatrists, ornithologists,
pharmacologists, biological assayists, sampling and quality control
statisticians, pomologists, herpetologists, cancer and radium biologists,
foresters and geologists seem to be brothers and sisters under the skin
with the biophysicists, mathematical and vital statisticians, at least in
the cosmopolitan Biometric Society.

The annual business meeting for formation of the Eastern North
American Region was held in two parts, the first at the Stevens Hotel
in Chicago, December 27, and the second at the Commodore Hotel in
New York, December 30, 1947.

Geoffrey Beall presided at the Chicago meeting which took the fol¬ 
lowing action: 

1. After consideration the proposed By-laws for the Region 
were adopted with a few minor changes. 

2. A slate of officers for 1948, was sent to the Nominating 
Committee in New York to be considered with other nominations.* 

3. A Resolution of Appreciation to the American Statistical 
Association was indorsed unanimously: 

The Eastern North American Region of the Biometric Society,
at its first business meeting, wishes to record the indebtedness
of American biometricians to the American Statistical
Association. The continuing support of the Association has
made it possible for biometrics to develop into a recognized
field in America. All biometricians owe much to the Association
for its organization of the Biometrics Section, and the
founding of Biometrics. It is the hope of the Eastern North
American Region to establish and maintain the most cordial
relations with the American Statistical Association.

4. A motion authorizing the Regional Committee to affiliate
as soon as feasible with the ASA, AAAS and NRC Division on
Biology and Agriculture carried unanimously.

5. A motion that meetings in 1948 be arranged jointly by the 
Biometrics Section of the ASA and the Eastern North American 
Region of the Society was carried unanimously. 

After some discussion of desirable meetings, it was agreed that 
everyone present approved such cooperative endeavors but the 
details should necessarily be left to the Regional Committee. 

6. After considerable debate on the manner in which the spirit 
might be best implemented, it was unanimously agreed that “the 
meeting in Chicago gives proxy to the acting secretary of the New 
York meeting to make any minor modifications he may see fit in 
action taken in Chicago and to make decisions subject to confirmation
or reversal at the next annual business meeting.”

C. I. Bliss, acting for D. B. DeLury, presided at the New York 
meeting, which by virtue of a common agenda had the objective 
of completing action started at Chicago. 

1. After discussion of the modified, proposed by-laws, and a few 
voted changes, a motion was carried to accept the following: 

As a subdivision of the Biometric Society, the Eastern North 
American Region is governed by the Constitution of the Society and 
the following By-laws. 

1) Scope. The Region shall be concerned with biometrical
activities, as defined in Article 1 of the Constitution, in that part
of the United States and Canada lying east of approximately 104°
longitude.

2) Affiliations. By action of the Regional Committee, ap¬ 
proved by the Council of the Society, the Region may affiliate 
itself with national organizations in the fields of biology, mathe¬ 
matics, and statistics. Such affiliation shall be reported to the 
membership at the next annual meeting for confirmation. 

3) Regional Committee. The Vice-President of the Society
for the Region, who shall serve as its chairman, the Regional
Secretary-Treasurer and 6 ordinary members shall constitute the
Regional Committee, all to be elected by the Council of the
Biometric Society. The Regional Committee shall be responsible
for the affairs of the Region between annual meetings. Each
annual meeting shall submit nominations for a Vice-President, a
Secretary-Treasurer, and two ordinary members for 3 year terms, to
the Council. No ordinary member shall be nominated to succeed
himself.

4) Regional Advisory Board. There shall be a Regional Ad¬ 
visory Board, composed of the Regional Committee, members 
designated to represent the Region in other organizations, chair¬ 
men and other designated representatives of committees, and such 
other members as the Regional Committee shall appoint. The 
Regional Advisory Board shall advise the Regional Committee on 
matters of policy and the appointment of committees, and shall, 
through its members, help to keep the Regional Committee in touch 
with the interests and wishes of all members of the Region. 

5) Annual Meeting. There shall be an annual meeting of the 
Region at a time and place determined by the Regional Committee. 

6) Associate Members. Members of organizations with which 
the Region is formally or informally affiliated who subscribe to 
Biometrics shall be associate members of the Region. They shall 
have all privileges except those of voting and holding office. 

7) Program Committees. The Vice-President shall appoint 
standing program committees with rotating membership to arrange 
meetings in connection with those of specific organizations or 
groups of organizations. Each such committee shall report its 
tentative plans for the coming year at the annual meeting. Each 
independent meeting of the Region shall be arranged by a special 
program committee appointed for the purpose. 

8) Nominating Committee. At the annual meeting, the Vice- 
President shall appoint a Nominating Committee of four mem¬ 
bers, which shall present its nominees for the Regional Committee 
to the next annual meeting. 

9) Amendment. These By-laws may be amended by a two- 
thirds vote of those attending the annual meeting. Each pro¬ 
posed amendment shall be placed on the agenda. 

2. The Resolution of Appreciation to the American Statistical 
Association, previously indorsed in Chicago, was indorsed unani¬ 
mously. 

3. A motion was carried authorizing the Regional Committee to
affiliate as soon as feasible with the ASA, the AAAS and NRC Division
on Biology and Agriculture.
4. The Chicago motion that meetings in 1948 be arranged jointly
by Section and Region was confirmed.

5. The Nominating Committee (H. Muench, L. Knudsen and C. 
Eisenhart) presented a slate of officers. After considerable discussion 
and nominations from the floor, the following officers for 1948 were 
elected for Council confirmation: 

Vice-President: Charles P. Winsor

Secretary-Treasurer: John H. Watkins

Regional Committee:

1948-1950: N. Rashevsky; H. D. Landahl

1948-1949: Margaret Merrell; Phillip J. Rulon

1948: E. J. deBeer; A. E. Brandt

The British Region, which has twenty charter members, held an 
organization meeting in London on January 28. A provisional com¬ 
mittee (J. W. Trevan, E. C. Fieller, R. A. Fisher, J. B. S. Haldane, 
K. Mather, Eric Wood) was appointed to draft regional rules for 
submission to a later inaugural meeting, and to draw up a list of those 
who should be invited to join. Announcements concerning the Society 
are being sent to Nature and other suitable journals. 

E. A. Cornish, organizing the Australian Region, has sent relevant 
material to everyone in Australia likely to be interested and does not 
think there will be any difficulty at all in setting up the region. Helen 
Turner is acting as Secretary-Treasurer. Cornish has forwarded a list 
of twenty-one who wish to become charter members in the Society. 

A. Buzzati-Traverso has forwarded to the Secretary’s office a list 
of nine colleagues who have asked to become members of the Society. 
He is getting in touch with Georges Teissier with a proposal that the 
Italian members join the French Region. 



ANNUAL MEETING OF THE BIOMETRICS 
SECTION 

The annual business meeting of the Biometrics Section was held at
the Commodore Hotel, New York City, at 2 P.M., December 30, 1947.
C. I. Bliss, presiding in the absence of the Section Chairman, reported 
on the First International Biometric Conference at Woods Hole, Mas¬ 
sachusetts. The Proceedings of this conference appeared in the De¬ 
cember issue of Biometrics. 

The secretary, H. W. Norton, reported that the Section had grown 
to 1040 members and 240 associate members. 

The presiding chairman presented a report of the joint business
meeting with the Biometric Society, held at the Stevens Hotel in
Chicago on December 27, 1947. This report, pertaining principally to the
formation of the Eastern North American Region of the Biometric
Society, is contained in the report of the Society elsewhere in this issue.

Gertrude Cox, chairman of the Editorial Committee of Biometrics,
reported on the increase in the annual subscription from one to two
dollars and discussed the revised format and increased size of the
journal.

A joint meeting with the Pharmacological Society in Atlantic City 
on March 15-19, 1948 was announced, and a motion was carried that 
meetings during the forthcoming year be arranged jointly with the 
Eastern North American Region of the Biometric Society. 

The Nominating Committee, C. P. Winsor, chairman, A. E. Brandt,
P. T. Bruyere, Lila F. Knudsen, Hugo Muench and Churchill Eisenhart,
presented a slate of officers for 1948, who were elected unanimously.
The new officers are: Chairman, Joseph Berkson; Secretary,
John H. Watkins; Section Committee, D. J. Finney, Margaret Merrell,
P. J. Rulon and J. W. Tukey; Editorial Committee, Gertrude M. Cox,
chairman, C. I. Bliss, W. G. Cochran, Churchill Eisenhart, J. W. Fertig,
H. C. Fryer, Horace Norton, A. M. Mood, G. W. Snedecor, and
Jane Worcester.
STATISTICAL SUMMER SESSIONS AT THE UNIVERSITY
OF CALIFORNIA

Berkeley, California

Following the encouraging experience of last year the University
of California offers statistical programs in two Summer Sessions of
1948. The teaching staff is as follows:

Raj Chandra Bose, Professor of the University of Calcutta, 
India. 

Miss Evelyn Fix, Lecturer at the University of California, Ber¬ 
keley. 

Erich L. Lehmann, Assistant Professor of the University of 
California, Berkeley. 

Michel Loeve, Reader at the University of London, England. 

Jerzy Neyman, Professor of the University of California,
Berkeley.

Abraham Wald, Professor of Columbia University, New York. 

Courses in statistics are offered on both the graduate and the under¬ 
graduate levels. The graduate courses, all given during the first Sum¬ 
mer Session, are meant primarily for students who either have already 
obtained their Ph.D. degree or are working towards it. Therefore, 
apart from formal classes, it is proposed to hold extensive seminars in 
which the work of students will be discussed. No specific prerequisites 
to graduate courses will be required. However, to benefit from the 
courses, the students must be generally familiar with the theory of 
statistics. In addition, course 272 and especially 271 will require a 
reasonable knowledge of the theory of functions. 

There will be two undergraduate courses offered, course S12 during 
the first Summer Session, June 21 to July 31, and course S113 during 
the second Summer Session, August 2 to September 11. Both of these 
courses were recently introduced into the curriculum and are pre¬ 
requisites to more advanced courses in statistics. They are offered 
during the Summer Sessions for the benefit of students, otherwise 
advanced, who plan to attend more advanced courses in statistics dur¬ 
ing the fall semester. Besides, course S12 is recommended for students 
who do not intend to specialize in statistics but wish to acquire some 
knowledge of this subject as a part of their general education. 

The Statistical Laboratory will be available for students doing re¬ 
search. 
FIRST SUMMER SESSION

S12.   Elements of Probability and Statistics.   Mr. Lehmann

271.   Random Functions.                         Mr. Loeve

272.   Sequential Analysis.                      Mr. Wald

273.   Design of Experiments.                    Mr. Bose

S290s. Seminar in Theory and Statistics.         Mr. Loeve, Mr. Wald

290t.  Seminar in Design of Experiments.         Mr. Bose

S295.  Individual Research.                      Mr. Bose, Mr. Loeve,
                                                 Mr. Neyman, Mr. Wald


SECOND SUMMER SESSION

S113.  Second Course in Probability
       and Statistics.                           Miss Fix

Other details may be had upon request to the Summer Sessions 
Office, University of California, Berkeley 4, California. 



NEWS AND NOTES 


George W. Snedecor, President of the American Statistical Association
and Research Professor of Statistics at Iowa State College, will
be Visiting Research Professor of Statistics at Alabama Polytechnic
Institute during the Spring Quarter, from March 22 to June 4, 1948.
He will lecture on Statistical Experimental Design and will be avail¬ 
able for consultations. The newly formed Statistical Laboratory at 
A. P. I. also will offer a course in Survey Sampling during the Spring 
Quarter to be taught by the Director, T. A. Bancroft. Conferences 
in applied statistics for research workers in the lower southeastern 
states are being scheduled during the time of Mr. Snedecor’s visit. . . 
Horace Norton, formerly with the Washington office of the Weather 
Bureau, reported November 3, for duty with the Atomic Energy Com¬ 
mission. His new work is in connection with accountability for source 
and fissionable materials . . . Carl F. Kossack keeps us guessing 
where to locate him. At last report he is with the Department of 
Mathematics at Purdue University, Indiana . . . Bliss H. Crandall, 
Director of the Statistical Laboratory, Utah State College, Logan, is 
spending six months at the Institute of Statistics, Raleigh, North 
Carolina. He is visiting classes and assisting with consulting and 
analytical work ... A number of biologists in the New York Metro¬ 
politan area have diagnosed themselves as being statistically under¬ 
nourished and have banded together to form a study group designed 
to alleviate this deficiency. Self-medication consisting of “t’s”, re¬ 
gression coefficients, variances, and assorted statistical treatments are 
administered at bi-weekly meetings. If you are interested in joining 
this clinic, either as an observer or as a patient, get in touch with 
Edwin J. de Beer, The Wellcome Research Laboratories, Tuckahoe 7, 
New York . . . J. I. Northam from the University of Michigan and 
Marily Spanglet from the University of California are now at Kansas 
State College helping teach statistics. H. C. Fryer has a statistical 
program operating at the Agricultural Experiment Station and Kansas 
State College, Manhattan . . . F. J. Anscombe, Rothamsted Experimental
Station, has been appointed lecturer in the Faculty of Mathematics,
University of Cambridge . . . The Calcutta Statistical Association
was started towards the end of 1945 with the object of promoting
the cause of statistics in post-war India in all possible ways. Their
Bulletin states, “There is a greater need in our country than elsewhere 
of training up competent statisticians and educating the government 
and the public on the usefulness of reliable statistics and their
dispassionate analysis in solving the diverse problems of our life.” Such a
need is not confined to India. In fact, the statisticians of India are 
taking an active part in promoting national and international statis¬ 
tical programs. P. C. Mahalanobis is a member of the Statistical Com¬ 
mission set up by the Economic and Social Council of the United 
Nations. He is Chairman of the Sub-Commission on Statistical 
Sampling . . . R. C. Bose, Calcutta University, is a Visiting Pro¬ 
fessor at the University of North Carolina where he is teaching a 
course in Multivariate Analysis. In the Spring Quarter he will give 
a course in Design of Experiments. During the Fall he was at Colum¬ 
bia University, New York . . . D. Mangeron, Director of the Mathe¬ 
matical Institute of the Polytechnical High School of Jassy, Roumania, 
writes, “Let us rejoice that this period of destruction is come to an 
end and the universal desire among scientists to see better international 
scientific contacts in the coming years begin to become a reality.” 
They are soliciting original articles of Mathematics, “Fisics,” Chemistry,
and Technics for publication and are requesting extracts or reprints
of publications destinated eventually to a review in their Bulletin
. . . Joseph Carmin, Director, Independent Biological Laboratories,
Kefar-Malal, P. O. Ramatayim, Palestine, reports, “A building
was erected at Kefar-Malal, an agricultural settlement ... It is a
pleasure to be able to state that the moving and the rearranging is 
almost all over and that we are ready now for research work; we are 
able to accommodate a dozen workers with working facilities.” 
THE STATISTICAL ORGANIZATION OF NERVOUS ACTIVITY*

Warren S. McCulloch
Walter Pitts†

University of Illinois College of Medicine,

Illinois Neuropsychiatric Institute, Chicago


INTRODUCTION 

This is not a review of neurophysiology but a synopsis of some
theories which may lead to an understanding of the mental aspects 
of nervous activity, namely ideas and purposes. The highway to ideas 
lies through statistical conceptions from their logical foundation in 
Boolean algebra through modern methods of constructing invariants by 
averaging over groups of transformations. Purposive behavior depends 
upon how output affects input which, in turn, depends upon a nervous 
system whose organization can be treated statistically. This is instanced 
in one reflex. Known details of other mechanisms are in current publica¬ 
tions. The theory is extremely atomistic. The ultimate units of nervous 
activity are impulses which, being all-or-none signals, submit to the 
Boolean algebra of propositions and hence to statistical treatment. A 
field-theory does not now exist and may never cope with the inherent 
complexities. It has been shown to be unnecessary. 

Figure 1 shows a neuron, and labels the dendrites, cell-body and
axon. Proper irritation of the cell starts off a signal, a ring of negative
voltage, which then travels from the body along the axon and its branches
at a speed between one foot and three hundred feet per second, depending
upon the thickness of the axon. The thicker axons are also longer, so
that the total time of transit is more nearly the same from the beginning
to end of any axon than if all were of one diameter. We shall treat this
time as if it were constant, and therefore negligible. The end of an axon
is either in a muscle or gland, or else forms a small knob on another
neuron, as in Figure 2. These knobs are called synapses. Signals arriving
at synapses irritate the recipient neuron locally for about two-tenths
of a millisecond. If signals arrive within that time on enough of its
synapses, they combine to start off a signal half a millisecond later along
the axon of the recipient neuron. The amount of irritation required for
this is called the threshold of the neuron, and the delay the synaptic delay.
After transmitting one signal, a neuron will not transmit another for
about eight-tenths of a millisecond: it is said to be refractory. It is
obvious from this that no two successive signals along the same axon
can combine their irritation at the terminal synapse. Although the
anatomy is not known, impulses arriving somewhere in the vicinity of a
neuron, either directly or by way of sub-threshold irritation of
intermediary neurons, do prevent the neuron in question from responding to
otherwise adequate irritation. We draw such an inhibitory ending as a
loop around a dendrite, as in Figure 2. Neurons are unlike ordinary
electric circuits in that the energy sustaining the signal is always
supplied locally; that is why its final size is not affected by events at its
origin or along its course. Neurons have other properties which we shall
ignore in the present sketch.

[FIGURE 1. Real neuron (one type).]

* This work was aided by a grant from the Josiah Macy, Jr. Foundation.
† John Simon Guggenheim Fellow for 1947.

Let time flow equally in measured lapses, say a millisecond apiece,
and number them beginning with any one that is convenient. A given
neuron cannot transmit two signals in a single lapse: it must have either
one on it or none. For every such lapse there is therefore one proposition,
say $S_A(t)$ for neuron A, such that knowledge of its truth or falsity
describes the neuron completely: namely, $S_A(t)$ asserts that there is a
signal on A at t. Further, since neurons influence one another only by
signals, all the significant relations within a nervous net can be expressed
as propositional relations which only involve truth-values. This is to say
that nervous activity can be described in the calculus of propositions as
follows. If, for two propositions p and q, we use the notations:

$\sim p$ = ‘p is false’, ‘not p’
$p + q$ = ‘either p or q or both’
$p \cdot q$ = ‘both p and q’
$p \supset q$ = ‘if p then q’, ‘p implies q’
$p \equiv q$ = ‘p if and only if q’,

then possible relations between the actions of neurons in Figure 2 will
include the cases:

1). Simultaneous summation from both A and C is necessary to
excite D:

$$S_D(t + 1) \equiv S_A(t) \cdot S_C(t).$$

2). Either A or C is alone capable of exciting D:

$$S_D(t + 1) \equiv S_A(t) + S_C(t).$$

3). A can excite D, unless B inhibits it:

$$S_D(t + 1) \equiv S_A(t) \cdot \sim S_B(t).$$

In the whole nervous net we shall have an equivalence of this kind
giving the conditions of excitation for each neuron in the net. Provided the
net is free of circular paths (that is, if it is never possible to follow down
the axon of a neuron and its successors in such a way as to return to the
starting point), then these equivalences may be substituted into one
another so as to obtain, for each output-neuron, a set of necessary and
sufficient conditions of excitation, in the form prescribed by this calculus, 
in terms of the signals coming into the system as input from sense- 
organs. If we are allowed extra delay between input and output, we can 
construct a net to make the output any desired logical function of the 
input, provided only it satisfy the condition that it is false when all its 
atomic components asserting the occurrence of signals in the input are 
false. For if no signals ever enter the net, none can emerge. If spon¬ 
taneously active neurons are admitted, this restriction also vanishes. 
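The substitution of one equivalence into another can be illustrated with a small Python sketch. It is only an illustration: truth-values stand for signals, each function call stands for one synaptic delay, and the two-layer net, its function names, and the choice of output (A or C but not both, which is false when both inputs are false) are hypothetical examples rather than anything taken from Figure 2.

```python
from itertools import product


def excite_and(a, c):
    """Case 1: summation from both inputs is necessary."""
    return a and c


def excite_or(a, c):
    """Case 2: either input alone suffices."""
    return a or c


def excite_unless(a, b):
    """Case 3: a excites unless b inhibits."""
    return a and not b


def exclusive_or_net(a_t, c_t):
    """Loop-free net with two lapses of delay between input and output.

    Lapse t to t+1: an intermediate neuron D fires if A or C fired,
    and a second intermediate neuron E fires if both fired.
    Lapse t+1 to t+2: the output fires if D fired unless E inhibits it,
    so the output at t+2 is equivalent to (A + C) . ~(A . C) at t.
    """
    d = excite_or(a_t, c_t)
    e = excite_and(a_t, c_t)
    return excite_unless(d, e)


for a, c in product([False, True], repeat=2):
    print(a, c, exclusive_or_net(a, c))
```

The extra lapse of delay is what allows a function that no single neuron of the three kinds above computes to be built from them.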

If neurons successively excite one another in a circle, a signal once 
started can circulate through the net indefinitely. Among other things, 
such circuits introduce the universal and existential operators of logic, 
applied to time past. They constitute a memory of a kind, whereby, in 
principle, signals delivered once to a net may cause it to behave differ¬ 
ently to certain inputs forever after. The actual durability of learning 
requires more stable devices than this, amounting to a change in the 
connections of the net: but this may not come formally to anything very 
different. To make marks and read them later also brings consequences 
formally similar to circulating activity in the net. 
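A circulating signal of this kind can be sketched as a single neuron whose axon returns to itself. In the Python sketch below, a hypothetical neuron M obeys M(t + 1) ≡ I(t) + M(t), so that M is active at t exactly when the input I was active at some earlier lapse, which is the existential operator applied to time past. The names and the trace format are assumptions for illustration.

```python
def run_memory(inputs):
    """Return the state of the looped neuron M at each lapse.

    inputs -- sequence of truth-values I(0), I(1), ...
    The returned list holds M(0), M(1), ..., M(len(inputs)),
    with M(0) false and M(t + 1) = I(t) or M(t).
    """
    m = False
    trace = []
    for signal in inputs:
        trace.append(m)
        m = signal or m  # the loop keeps a signal circulating once started
    trace.append(m)
    return trace


# One input signal, delivered at lapse 1, alters the net permanently.
print(run_memory([False, True, False, False]))
```

Once the input has occurred the state never returns to false, which is the sense in which a single delivered signal can change later behavior.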

Besides these microscopic properties, the ten billion neurons in the 
brain show regularities in the large which are properly statistical, and 
are necessary for a nervous system to survive and reproduce. 

One kind extracts the important universal out of the excessive
particularity of their exemplars. An animal must recognize visible
objects irrespective of his distance from them and his perspective, that
is, independent of their absolute size or position in the visual field. The
latter invariance he secures by a reflex which snaps the eyes to the
“center of brightness”, and the image therewith to a standard place.
This is one general kind of mechanism. A second secures the size-invariance
of shapes: the nervous system may actually form all the possible
magnifications and constrictions of the image, either simultaneously at
different places or successively at one place, calculate an important
parameter for each size, and add them. Such a sum would have been the
same, by definition, if we had started with the same shape in a different 
size. Enough invariant sums of this kind may be computed to enable the 
system to recognize the form as well as it needs to. Since, in a finite 
net the number of such transformations is finite, these sums are really 
averages over groups of transformations.¹


¹ Mechanisms differ from one another in the calculation of parameters to be averaged. Thus, in
effect, the reflex that centers gaze on objects to be recognized, by moving the eye so rapidly as to flush
the visual cortex with “ons” and “offs”, assigns there the value zero to all translations except the last,
Another statistical principle of organization occurs wherever the
nervous system, built on all-or-none principles, has to deal with the 
important variables in the physical world that are continuous along one 
or several dimensions. Light varies continuously in intensity, hue and 
shade; sound, in loudness and pitch; and so on. The nervous system 
represents these magnitudes as averages of many kinds. It averages 
over time when a sense-receptor emits a series of impulses whose frequency
measures the intensity of the continuous variable stimulating it,
or when a muscle fiber in tetanus adds increments of tension evoked by
signals along the innervating axon during some time past. It likewise 
averages in space when a higher grade of a sensory variable stimulates 
more receptors, or more motor axons excite more fibers in a muscle. This 
is one reason for the enormous reduplication of parallel paths in the 
nervous system. The result of all this averaging is a very fair approximation
to a continuous dynamical control-system for gauging the application
of physical force to move matter in the light of continuous information
about the consequences.

These matters are well illustrated in the simple case of the stretch- 
reflex. With some simplification the mechanism is diagrammed in 
Figure 3. Receptors in the muscle send signals into the spinal cord at a 
frequency p which is some monotonic function f(L) of the length of the
muscle. These signals are reduplicated in branches of the sensory axons 
carrying them, to converge on the motor cells of the ventral horn which 
innervate the same muscle. The motor neuron will transmit a signal 
whenever the number of afferent signals coinciding on it within a short 
interval exceeds the threshold h. We shall take this interval as the unit 
of time. If afferent impulses are statistically independent and asynchronous,
the probability-distribution of the total number arriving per
unit time will tend either to the Gaussian or the Poisson distribution,
depending upon the magnitude of p. In the former case, if N be the 
average number of different axons afferent to one motor neuron, the 
mean will be Np and the variance Np( 1 - p), so that the average 
number of signals per unit time delivered to the muscle along a motor 
axon will be 

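$$E(p) = \operatorname{erf}\left[\frac{\sqrt{N}\,(p - h')}{\sqrt{p(1 - p)}}\right],$$

$$\operatorname{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^{2}/2}\, dt,$$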

which therefore alone determines the average. We can not conceive any mechanism for detecting
universals which may not be described in this general manner of averaging over groups.
in which h' = h/N, the relative threshold. E(p) is sigmoid, monotonic,
and varies from zero to unity as p does, provided that h' < 1.² We see
that the inflection point of E(p) is reached as p reaches the relative
threshold h', and that its slope at that point is proportional to the
square-root of N. This mean frequency E(p), delivered over the axons
innervating the muscle, will develop a tension given by a certain monotonic
function T = T(E), which then tends to reduce the length of the
muscle. This familiar process, whereby a change in the output causes a
change of opposite sign in the input, sets equilibrium length L and a
tension T which just holds that length against the external load. It
also returns the muscle to that length if it be perturbed from without in
any way.
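The Gaussian case described above can be sketched numerically. The following is only an illustration under stated assumptions: the count of coincident afferent signals is taken as normal with mean Np and variance Np(1 - p), the motor neuron fires when that count exceeds h = Nh', and the function names and the values of N and h' are hypothetical. Note that the text's erf is the normal probability integral, which is built here from Python's `math.erf`.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal probability integral (the text's "erf")."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_output_frequency(p: float, n_afferents: int, rel_threshold: float) -> float:
    """Gaussian approximation to E(p), valid for 0 < p < 1.

    Assumes the afferent count has mean N*p and variance N*p*(1 - p),
    and that the neuron fires when the count exceeds N*rel_threshold.
    """
    z = math.sqrt(n_afferents) * (p - rel_threshold) / math.sqrt(p * (1.0 - p))
    return normal_cdf(z)


# Hypothetical values: relative threshold 0.3, three sizes of afferent pool.
for n in (25, 100, 400):
    curve = [round(mean_output_frequency(p, n, 0.3), 3) for p in (0.1, 0.2, 0.3, 0.4, 0.5)]
    print(n, curve)
```

Under these assumptions the curve passes through one half at p = h' and steepens as N grows, in keeping with the statement that the slope there is proportional to the square-root of N.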

It is evident that the equilibrium length sought by the reflex under 
given circumstances can be varied at will by controlling the value of the
central threshold h'. Formally, this is exactly the effect wrought by 
additional signals descending from higher nervous structures to intervene 
in the stretch-reflex arc. The engineer would say that the signals from 
higher structures control the gain around the loop of the stretch-reflex. 
Quite generally, this is the plan of sub- and super-ordination prevailing 
in the nervous system. No higher structure alone can move muscles; 
it can only control the “central amplification” of the elementary spinal 
reflex arc. In monkeys, to cut off all the afferents from a limb acutely 
paralyzes it as completely as if the motor nerve had been severed. 

Three or four principal circuits send parallel descending tracts to 
control the spinal cord in this way. Some of them proceed from their
own sensory afferents and in turn have their own gains controlled by
super-ordinate systems. Thus the labyrinths inform the vestibulo-spinal
circuit which direction is down, whereupon it amplifies the stretch-reflex
in the anti-gravity muscles accordingly. Similar circuits control
the velocity of movement, to keep it smooth and in constant relation to
moving physical objects. Others keep the body at even temperature,
the blood pressure constant, and the respiration sufficient to hold the
carbon dioxide and oxygen tensions at proper values. Many of them are
regularly periodic, like those of walking, breathing and sleeping.

There are circuits that pass from the central nervous system through 


² Actually, numerous complications beset this simple line of argument. Refractoriness causes E(p)
to level off at an asymptote short of unity, and inhibitory impulses arrive from antagonistic muscles,
“associative” neurons in the spinal cord and higher structures. Lloyd’s facilitation and inhibition
at this synapse must likewise be included. Even so, the great generality of reasoning based on the
limit-theorem of Laplace and Liapunoff permits us to take such influences into account in the same
way. The resulting function is provisionally concordant with the pitifully few E(p) curves in the
physiological literature.
effectors into the world about us to procure the necessities of life: and,
of them, some, making use of symbols, keep us adjusted to the complication
of society. When two are incompatible, choice is insured either by
one inhibiting the other or by a requirement of summation from the 
rejected to the preferred. Since, of three such circuits, the first may 
dominate the second, the second the third and the third the first, values 
need have no common measure. When these circuits are built into us by 
the usual processes of growth, they operate so automatically that we are 
scarcely aware of them. Experience of choice usually arises at the 
moment we are forced to make a novel decision. 

At present we do not know how our nets are changed by such decision.
We try many things and finally succeed; the successful mode of
action most commonly becomes the preferred: but whether this is due to 
growth of neurons or changes in threshold is obscure. Heredity cannot 
fix the thresholds and connections of so many neurons. It can only lay 
down the general plan and leave particulars to chance. Experience 
brings order into this chaos, and in doing so gives us a memory unlike 
a written record. It is better conceived as the establishment of a connection,
which, once made, works henceforth so that the new is always
built upon the old. This gradual ordering of the nervous system is like 
permanent magnetization in an originally unmagnetic bar of steel. 
Apart from learning, a mathematical account of nervous systems whose 
connections are random in detail is part of the difficult and incomplete 
realm of statistical mechanics that deals with change of state. Even so, 
numerous calculations of quantities measured in experimental electrophysiology
have been made, and there is reason to expect more in the
near future. 

We can summarize our conclusions as follows: 

1) The actions of neurons and their mutual relations can be described
by the calculus of propositions subscripted for time.

2) The nervous system as a whole is ordered and operated on
statistical principles. Thereby it adjusts the all-or-none laws
governing its elements to a physical world of continuous variation.

3) It detects universals. 

4) It conserves its own level of activity, the condition of the body 
it inhabits, and its relation to the physical world by activity in 
closed paths such that a change in its output causes a change of 
opposite sign in its input. 
5) It chooses between ends.

6) It alters its structure by experience. 

Finally, the mathematical treatment of its activity presents numerous
problems in the theory of probability and stochastic processes.
EXPERIMENTAL DESIGN IN COMPARISON
OF ALLERGENS ON CATTLE

F. M. Wadley

U. S. Department of Agriculture 

Tuberculosis and Johne’s disease in cattle may be diagnosed by the 
injection into the skin of appropriate allergens (tuberculin and johnin) 
prepared from cultures of the organisms which cause these diseases. 
Animals which are or have been diseased show a reaction to the allergens 
in the form of a thickening of the skin immediately surrounding the site 
of injection. Unfortunately, when using current preparations of the 
allergens, cows with tuberculosis may exhibit some reaction to johnin, 
and vice versa. Even though the intensity of reaction is greater to the 
allergen of the disease from which the animal is suffering, some confusion
in diagnosis results. The Pathological Division of the Bureau of Animal 
Industry has been carrying out studies which have as their purpose the 
development of allergen preparations more specific for each disease, and 
at the same time potent. The general results of these studies will be 
published elsewhere, but certain points relative to the experimental 
designs may be summarized here.¹

In the experimental work on this problem, animals artificially sensitized
to either tuberculin or johnin have been employed. The intensity
of reaction to the allergen preparations may be measured by the amount
of skin thickening produced using the technique of Dr. H. W. Johnson of
the Bureau (1944). In preliminary tests it was found that a satisfactory
degree of reaction to the allergen and a satisfactory measurement of the
skin thickening could be obtained on the neck-flank, back and upper and
lower side regions of the sensitized animals (Figure 1). Further, a total
of [...] separate injections could be distributed over these regions,
with reactions to the individual injections occurring independently.

Before starting the experimental program proper, the importance of 
various factors which might affect the response to allergens was studied 
in a series of uniformity trials. To illustrate the major findings, data 
from one of these trials are cited. Here a preparation of johnin was
injected into johnin-sensitized cows. The material was administered at
two concentrations in each of four regions (Figure 1) on each side of five 
cows. Summarized results and analysis are given in Tables 1, 2, and 3. 


¹ The material is taken from notes on the consulting work done by the Experimental Design
Committee of the U.S.D.A. in connection with the project.
TABLE 1

AVERAGE REACTIONS IN MM. (COWS AND CONCENTRATIONS)

                                Cow No.
Concentration       1        2        3        4        5     Average

1/100             6.72    10.97    10.84     7.72     6.88      8.63
1/1000            3.19     5.12     5.28     2.91     3.53      4.01


TABLE 2

AVERAGE REACTIONS IN MM. (BODY REGIONS AND CONCENTRATIONS)

                                      Region
Concentration     Back    Neck-Flank    Upper Side    Lower Side    Average

1/100             5.78      10.95          8.40          9.38         8.63
1/1000            3.20       5.10          3.58          4.15         4.01


TABLE 3

ANALYSIS OF VARIANCE OF REACTIONS

Source of variation               Degrees of      Mean
                                  freedom         square

Cows                                   4           40.60
Regions                                3           43.81
Concentrations                         1          426.66
Interactions
  Cow × region                        12            5.13
  Cow × conc.                          4            5.24
  Conc. × reg.                         3           10.17
  Cow × conc. × reg.                  12            4.79
Sides                                  1           11.44
Interaction, cow × side                4            7.87
Other interactions with side          35            1.72


The reactions in this trial were unusually high, but otherwise the 
findings illustrate the conditions to be encountered. Of particular importance
were the marked differences between cows and between regions.
These differences were consistent from trial to trial, the back region
giving low reactions, the neck-flank region high reactions, and the side
regions average reactions (Figure 1). Reactions within a region were
fairly homogeneous. The effect of varying the allergen concentration
was here, and in the other trials, very pronounced relative to experimental
error. A tendency to interaction of concentration and region
was present in this trial, but its magnitude in the remaining uniformity 
trials led to the conclusion that failure to consider it in the design and 
analysis of the experiments would not lead to erroneous interpretations. 
The cow by concentration interaction was here negligible, but estimates
made from the series of experiments indicated that it was a small source 
of experimental error. Side variation was here not significant, and the 
results of the whole series of trials warranted the assumption that the 
left and right sides do not differ in response. 

It will be noted in Tables 1 and 2 that a ten-fold dilution approximately
halves the response, suggesting that the response is not a linear
function of concentration. This relation was studied further in the 
other uniformity trials using series of concentrations, and it was found 
to be definitely curvilinear. It became essentially linear over a wide 
range, however, when logarithms of concentration were employed. 
Although not illustrated here, it was also learned in the series of uniformity
trials that operators differed but little and that measurement
errors were relatively small as compared to other sources of experimental 
error. 
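As a small numerical illustration of the relation just described, the average reactions at the two concentrations in Tables 1 and 2 can be put on a log-concentration scale. This is only a sketch: two points cannot test linearity, and the function name is hypothetical. It shows merely that a ten-fold dilution changes the response by a fixed number of millimeters per unit of log10 concentration rather than dividing it by ten.

```python
import math

# Average reactions in mm. at each concentration (last column of Tables 1 and 2).
average_reaction = {1 / 100: 8.63, 1 / 1000: 4.01}


def slope_per_log10_unit(data):
    """Change in response per unit change in log10(concentration), from two points."""
    (c_low, y_low), (c_high, y_high) = sorted(data.items())
    return (y_high - y_low) / (math.log10(c_high) - math.log10(c_low))


slope = slope_per_log10_unit(average_reaction)
ratio = average_reaction[1 / 100] / average_reaction[1 / 1000]
print(f"response ratio for a ten-fold change in concentration: {ratio:.2f}")
print(f"slope per log10 unit of concentration: {slope:.2f} mm.")
```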
These results indicated that a practical minimum for experimental
error could be obtained if comparisons of allergen preparations were 
made on an intra-cow, intra-region basis, it being noted that comparable 
regions on the left and right sides of a cow may be considered as a single 
region. 

The experimental program involved the comparison of numerous 
allergen preparations. In order that potency may be properly assessed 
and specificity ascertained, at least two concentrations of each preparation
must be administered. As few as two concentrations may be used,
however, only when the preparations being compared are known to be
qualitatively similar and the potency of each is already approximately
known. When faced with uncertainty as to qualitative similarity and
potency as well, three, four or even more concentrations of each preparation
may be required in order to obtain the desired information.

Usually, each of the four regions on a given side of a cow will furnish 
8 to 12 usable injection sites, although less than this number may sometimes
be available. Combining comparable regions on the two sides
would thus yield homogeneous blocks with 16 to 24 injection sites. If 
the numbers of preparation-concentration combinations to be tested do 
not exceed 16 to 24, there might be employed a design of the randomized 
“complete” blocks type, in which all treatments occur in all regions. 
This would yield a minimum experimental error. On the other hand, the 
number of preparations and concentrations desired may require more 
injection sites than are available in a region. The use of a “complete” 
blocks design would then necessitate that the blocks overlap two or 
more regions, with an accompanying increase in experimental error. 
The increase in block size and, therefore, in error might be avoided by 
the use of an “incomplete” blocks design. In such a design only a part 
of the treatments are compared in each region, the particular set of 
treatments compared being different in each region. A “balanced 
incomplete” blocks design is to be preferred for the present purposes, 
since equal variance for all treatment contrasts is highly desirable, and 
computational complexities are avoided in the analysis. A balanced
design is characterized by the occurrence somewhere within a block of
every pair of treatments the same number of times. In the experimental
program, experiments of both the “complete” and “incomplete” blocks
type were used. An illustration of each is discussed in the following
paragraphs. 
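The defining property of a balanced design, that every pair of treatments occurs together in a block the same number of times, is easy to check mechanically. The sketch below is illustrative only: the function names are hypothetical, and the example design (7 treatments in 7 blocks of 3) is a standard textbook arrangement, not one of the designs used in these studies.

```python
from collections import Counter
from itertools import combinations


def pair_concurrences(blocks):
    """Count how often each pair of treatments falls in the same block."""
    counts = Counter()
    for block in blocks:
        counts.update(combinations(sorted(block), 2))
    return counts


def is_balanced(blocks, treatments):
    """True when every pair of treatments occurs together equally often."""
    counts = pair_concurrences(blocks)
    occurrences = {counts[pair] for pair in combinations(sorted(treatments), 2)}
    return len(occurrences) == 1


# Illustrative design, not from the text: 7 treatments in 7 blocks of 3.
example_blocks = [
    (1, 2, 3), (1, 4, 5), (1, 6, 7),
    (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6),
]
print(is_balanced(example_blocks, range(1, 8)))
print(is_balanced(example_blocks[:-1], range(1, 8)))
```

Dropping any one block from the example destroys the balance, since the pairs in that block then occur together less often than the others.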

A COMPLETE BLOCKS DESIGN 

In this experiment a standard tuberculin preparation was compared 
with a modified form. Three concentrations (250, 25 and 2½ parts per
10,000) of each were used; thus 6 treatments were studied. Fourteen
replications were made on each of 5 cows, each replication falling within
one of the regions of varying sensitivity. Treatment averages were computed
for each cow, and the analysis shown in Table 4 was then conducted
on the resulting 30 means (6 treatments on each of 5 cows).
Measurements of skin thickness were recorded in millimeters (mm.).

TABLE 4

ANALYSIS OF VARIANCE, STANDARD VS. MODIFIED TUBERCULIN

                               Degrees of      Mean
                               freedom         square

Between cows                        4            4.11
Between treatments                  5           26.34
Cow × treatment (error)            20            0.27


The basis for comparison of two materials is the ratio of concentrations
required for the same effect. If one substance requires twice the
concentration of another substance, it is only half as good. Bliss and
Marks (1939) have discussed, with references, the biological assay
method appropriate to studies in which response is a linear function of
the logarithm of concentration. In this method, the ratio of concentrations
required for a given effect is determined as the difference of log-concentrations.
Geometrically this is the horizontal distance between
the regressions of response on log-concentration for two materials being 
compared (Figure 2). If concentrations have been selected so that the 
differences between the logarithms of successive concentrations are 
constant, and if the regressions are linear and parallel, the estimation of 
relative potency is quite simple. 

The tests for linearity and parallelism following the scheme of Bliss
and Marks are shown for the present study in Table 5. This scheme
divided the sum of squares for treatments in Table 4 (5 × 26.34) into 5
independent parts which are associated with the 5 effects listed in the
lower part of the left hand column of Table 5. The figures in the “total
reactions” row are simply the sums of the individual cow averages for
each treatment. The figures in the “net sum” column are the sums of
products of the “total reactions” and the corresponding coefficients in
their respective columns. For example, the net sum for “Between
materials” is

(-1)(29.24) + (-1)(15.08) + (-1)(5.57) + (1)(33.77) + (1)(19.03)
      + (1)(7.05) = 9.96.








The “Divisors” are obtained by multiplying the sum of squares of the 
coefficients in the corresponding row by the number of figures added to 
obtain the total reactions. For example, the divisor for “Between 
materials” is 

5{(-1)² + (-1)² + (-1)² + (1)² + (1)² + (1)²} = 30.

The figures in the “S.S.” (sum of squares) column are derived as (net
sum)²/divisor. These are also mean squares since each has but one
degree of freedom associated with it. As a computational check, it 
should be noted that, except for rounding errors, these sums of squares 
total to the treatment sum of squares in Table 4 (5 X 26.34). 
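The arithmetic of net sums, divisors and sums of squares described above can be reproduced directly from the total reactions and the coefficients of Table 5. The following is a minimal sketch; the function and variable names are hypothetical, while the totals, coefficients and the treatment sum of squares (5 × 26.34) are those given in the text.

```python
# Total reactions (mm.) in the order: standard 250, 25, 2.5; modification 250, 25, 2.5.
totals = [29.24, 15.08, 5.57, 33.77, 19.03, 7.05]
n_cows = 5  # number of cow averages added to give each total

contrasts = {
    "Between materials": [-1, -1, -1, +1, +1, +1],
    "Linear regression": [+1, 0, -1, +1, 0, -1],
    "Non-parallelism": [+1, 0, -1, -1, 0, +1],
    "Curvature": [+1, -2, +1, +1, -2, +1],
    "Opposed curvature": [+1, -2, +1, -1, +2, -1],
}


def single_degree_analysis(totals, contrasts, n_per_total):
    """Return {effect: (net sum, divisor, sum of squares)} for each contrast."""
    rows = {}
    for name, coefficients in contrasts.items():
        net_sum = sum(c * t for c, t in zip(coefficients, totals))
        divisor = n_per_total * sum(c * c for c in coefficients)
        rows[name] = (net_sum, divisor, net_sum ** 2 / divisor)
    return rows


analysis = single_degree_analysis(totals, contrasts, n_cows)
for name, (net_sum, divisor, ss) in analysis.items():
    print(f"{name:<20s} {net_sum:+8.2f} {divisor:4d} {ss:8.2f}")

total_ss = sum(ss for _, _, ss in analysis.values())
print(f"sum of the five parts: {total_ss:.2f}; treatment sum of squares: {5 * 26.34:.2f}")
```

The last line is the computational check mentioned in the text: the five single-degree sums of squares should agree with the treatment sum of squares apart from rounding.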


FIG. 2. Common-slope regression lines fitted to data used in Tables 4 and 5 (average reaction
plotted against log-concentration). Log of potency ratio is represented by dotted line.
TABLE 5

SINGLE-DEGREE ANALYSIS OF TREATMENTS

Material                        Standard               Modification
Conc., parts per 10,000      250     25     2½      250     25     2½     Net sum   Divisor     S.S.

Total of reactions, mm.     29.24  15.08   5.57    33.77  19.03   7.05
Between materials             -1     -1     -1       +1     +1     +1     + 9.96      30       3.31**
Linear regression             +1      0     -1       +1      0     -1     +50.39      20     126.96**
Non-parallelism               +1      0     -1       -1      0     +1     - 3.05      20       0.47
Curvature                     +1     -2     +1       +1     -2     +1     + 7.41      60       0.92
Opposed curvature             +1     -2     +1       -1     +2     -1     + 1.89      60       0.06

**Highly significant


Each sum of squares (or each mean square) in Table 5 may be referred 
to the general error in Table 4 (0.27). The low mean squares in the last 
three rows show that the regressions were essentially linear and closely 
parallel. This is also apparent from the plotted points in Figure 2. The 
occurrence of parallelism as well as linearity has been the usual experience
in the present studies, where qualitatively and quantitatively
similar allergen preparations have been compared. Usually a priori 
knowledge allowed the selection of concentrations such that the average 
response from preparation to preparation did not vary greatly, and such 
that the responses obtained were in the range of linearity and of the 
greatest sensitivity to a change of concentration. With preparations 
which were qualitatively dissimilar or differed widely in potency, a lack
of parallelism was sometimes noted. In such cases relative potency can 
be stated for specific levels only. 

In the present example the log of the ratio of potency can be derived
from the formula of Bliss and Marks, (KID)/B, where D and B are the
square roots of the “Material” and “Regression” sums of squares,
respectively; K is a constant depending on the number of concentrations
employed (here 1.633); and I is the interval of log-concentration (here 1).
The log-difference is thus (1.633)(1)(3.31^(1/2))/(126.96^(1/2)), or 0.264. The
modification is, therefore, estimated as being 1.84 times as potent as the
standard.
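The potency calculation just given can be retraced as follows. This is only a sketch of the arithmetic in the text; the function name is hypothetical, logarithms are assumed to be to base 10 (consistent with ten-fold concentration steps and an interval of 1), and K = 1.633 is taken from the text as given.

```python
import math


def log_potency_ratio(ss_material, ss_regression, k, log_interval):
    """Log potency ratio (K * I * D) / B, with D and B the square roots
    of the "Material" and "Regression" sums of squares."""
    d = math.sqrt(ss_material)
    b = math.sqrt(ss_regression)
    return k * log_interval * d / b


log_ratio = log_potency_ratio(ss_material=3.31, ss_regression=126.96,
                              k=1.633, log_interval=1.0)
potency_ratio = 10 ** log_ratio  # assumes base-10 logarithms
print(f"log ratio = {log_ratio:.3f}, potency ratio = {potency_ratio:.2f}")
```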

The standard error of the log-ratio can be estimated here as
(s/b)(2/n')^(1/2), where s is the standard deviation of an individual
cow-treatment mean (0.27^(1/2)); b is the common regression coefficient of
reaction on log-concentration (estimated as net-sum/divisor for regression,
or 50.39/20); and n' is the total number of cow-treatment means for one
preparation (here 15). This gives an estimate of 0.075. In general
where 5 cows, with 10 to 15 replications per cow, have been used, and
where the variation in materials has been similar to that in the present
example, standard errors have been 0.10 or less. Thus, a log difference
of 0.20 or more (corresponding to potency ratios greater than 1.59 or less
than 0.63) would appear as significant. When 10 cows, with one replication
per cow, and substances varying more widely have been used,
standard errors have been 0.20 or higher.
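The standard error quoted above follows from the error mean square of Table 4 and the regression row of Table 5. The sketch below simply repeats that arithmetic; the function and variable names are hypothetical. It also converts a log difference of 0.20 in either direction into a potency ratio, again assuming base-10 logarithms.

```python
import math

error_mean_square = 0.27        # cow x treatment error, Table 4
net_sum_regression = 50.39      # "Linear regression" net sum, Table 5
divisor_regression = 20         # "Linear regression" divisor, Table 5
n_means_per_preparation = 15    # 3 concentrations on each of 5 cows


def log_ratio_standard_error(error_ms, slope, n_means):
    """(s / b) * sqrt(2 / n'), with s the square root of the error mean square."""
    return (math.sqrt(error_ms) / slope) * math.sqrt(2.0 / n_means)


slope = net_sum_regression / divisor_regression
standard_error = log_ratio_standard_error(error_mean_square, slope,
                                          n_means_per_preparation)
print(f"b = {slope:.4f}, standard error of log-ratio = {standard_error:.3f}")

for log_difference in (0.20, -0.20):
    print(f"log difference {log_difference:+.2f} -> potency ratio {10 ** log_difference:.3f}")
```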

The difference in standard errors just noted raises a question as to the 
relative influence on experimental error of several sources of variance. 
Experimental error in these studies may be considered as stemming from 
three sources: (1) cow by treatment interaction (between-cow), (2) region
by treatment interaction, and (3) cow by region by treatment interaction 
plus measurement errors, etc. (within cow). In experiments of the 
“complete” blocks type, where the same number of replications appear 
in a given region on all cows, source 2 may be considered as a non-random 
variable, and therefore not a source of error. This requires, of course, 
that regions be well enough defined so that they are adhered to closely on 
all cows. The relative importance of variance sources 1 and 3 can be 
estimated from experiments like the ones for which error sizes were just 
cited. These estimates indicated that source 3 (within-cow) accounted 
for at least 70% of the total variance attributable to sources 1 and 3 
combined. 

In the example given, a priori knowledge allowed the selection of a 
number of concentrations (three) at the levels which placed them in the 
useful part of the response range. Thus, satisfactory comparisons were 
obtained. Under some conditions as few as two concentrations may be
successfully used but with materials of unknown or widely varying 
potency, more concentrations are desirable. 

AN INCOMPLETE BLOCKS DESIGN 

In this experiment it was desired to compare 16 allergen preparations.
Knowledge concerning the materials was such that it was desirable to
study at least 4 concentrations of each, spaced at four-fold intervals.
Thus there were 64 treatments to be compared. The field workers
wished to use only 16 injection sites in each of the 4 compound response
regions previously described; i.e., 64 sites per cow. In each of those
regions, or blocks, 4 of the 16 preparations were injected, each at all 4
concentrations. Thus a complete replication could be placed on a single
cow, and by using the “balanced lattice” plan for 16 treatments, every
treatment could be brought together within a block with every other
treatment an equal number of times in 5 replications. Ten cows were
used, thereby furnishing two complete duplications of the basic design.
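One way to see that 16 treatments can be arranged in 5 replications of 4 blocks of 4, with every pair of treatments meeting in a block equally often, is to build such a plan from the lines of a 4 × 4 grid over the field with four elements. The sketch below is an illustration of that standard construction, not the particular plan or randomization used in the experiment; all names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Multiplication table of the field with four elements {0, 1, 2, 3};
# addition in this field is bitwise XOR.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]


def balanced_lattice_16():
    """Return 5 replicates, each a list of 4 blocks of 4 treatments (0..15)."""
    # Treatment number 4*x + y stands for the grid point (x, y).
    # First replicate: blocks are the lines with x held fixed.
    replicates = [[[4 * x + y for y in range(4)] for x in range(4)]]
    # Remaining replicates: blocks are the lines y = slope * x + intercept.
    for slope in range(4):
        replicate = []
        for intercept in range(4):
            block = [4 * x + (GF4_MUL[slope][x] ^ intercept) for x in range(4)]
            replicate.append(block)
        replicates.append(replicate)
    return replicates


def pair_counts(replicates):
    """Count how often each pair of treatments shares a block."""
    counts = Counter()
    for replicate in replicates:
        for block in replicate:
            counts.update(combinations(sorted(block), 2))
    return counts


design = balanced_lattice_16()
counts = pair_counts(design)
for number, replicate in enumerate(design, start=1):
    print(f"replicate {number}: {replicate}")
print("distinct pairs:", len(counts), "concurrences:", set(counts.values()))
```

In this construction each of the 120 pairs of treatments shares a block exactly once over the 5 replications; with two duplications on ten cows, as in the experiment described, each pair would meet twice.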

Part of the 16 preparations were of tuberculin and part of johnin. 
In order to study specificity, therefore, all 16 preparations were compared
on 10 tuberculin- or “bovis”-sensitized cows in one test. In a
second test all preparations were compared on johnin- or “para”-sensitized
cows. The results for the first test are cited here.

In evaluating the data, the log-concentration required to yield a skin 
thickening of 3 mm. was estimated by regression from the four concentrations
of each substance in each block separately. The resulting 160
figures were then analyzed in the standard manner for a balanced lattice
with two duplications. The analysis is shown in Table 6.

TABLE 6

ANALYSIS OF VARIANCE, LOGS OF CONCENTRATION REQUIRED
FOR 3 MM. REACTION, BOVIS-SENSITIZED

                                  Degrees of     Sum of      Mean
                                  freedom        squares     square

Cows                                   9          21.25       2.36
Treatments (ignoring blocks)          15         169.23      11.28
Blocks (adjusted)                     30          17.79       0.59
Error (intra-block)                  105          31.20       0.30

The fact that the adjusted block mean square is almost twice the size 
of the intra-block error mean square indicates that this design yielded 
more precise comparisons than if all 64 treatments had been randomly 
distributed over each cow disregarding response regions. Unfortunately, 
however, through a misunderstanding, the blocks as outlined were not 
placed exactly as intended on the regions of varying sensitivity. This 
lowered the precision to some extent. Nevertheless, in the case of the 
bovis-sensitized cows, the lattice design showed a gain of 8% over the
corresponding randomized complete blocks design. In the case of the
para-sensitized cows little gain was observed.

In a later trial with 16 strong and similar allergens, and better definition
of regions, a more clear-cut gain was reported, reaching over 50%.
The experience gained in the whole series of studies conducted indicates
that incomplete block designs of the type cited are promising where
large numbers of similar treatments are to be compared.
THE GENERAL THEORY OF PRIME-POWER
LATTICE DESIGNS*

II. DESIGNS FOR p^n VARIETIES IN BLOCKS OF p^s PLOTS, AND IN SQUARES
O. KEMPTHORNE AND W. T. FEDERER†


INTRODUCTION 

An earlier paper [4] contained the basis of the design and analysis of 
lattice (or quasi-factorial) designs for a number of varieties (p^n) which
is the power of a prime number (p), with a discussion of arrangements in
p^(n-1) blocks of p plots. The procedure consisted entirely of relating the
p^n varieties to the combinations of n factors, each having p levels, and
utilizing the concepts of effects and interactions of these factors [3] in 
both the design and analysis. 

The purpose of the present paper is to give designs for p^n varieties in
blocks of p^s plots, where s is an integer greater than unity, and designs
which utilize the Latin Square and split-plot principles. There will be 
little need to discuss the analysis of these designs, as this follows directly 
from the general formulation given in the earlier paper. It is intended 
however to present later more detailed numerical descriptions of the 
analyses of designs which appear to be of considerable practical value. 

DESIGNS FOR p^n VARIETIES IN BLOCKS OF p^s PLOTS

The use of blocks of size p for p^n varieties necessitates at least n
replicates, if intra-block information on all effects and interactions for
the corresponding factorial scheme is to be obtained. In the case of
512 varieties for example, correspondence with the 2^9 factorial system is
established and with blocks of 2 plots at least nine replicates are required.
In many cases it is impossible for the experimenter to use n or more
replicates, and it is then necessary to use blocks of size p^2 or p^3. If at
least n replicates are to be used, there seems little point in using a design
with blocks of p^2 rather than blocks of p plots. It is intended to present
here some considerations on this aspect of the problem.


*Contribution of the Statistical Section of the Iowa Agricultural Experiment Station in cooperation
with the Bureau of Agricultural Economics, United States Department of Agriculture. Journal
paper no. J 1654, Project 890.

†Associate Professor, Statistical Laboratory, Iowa State College and Associate Agricultural
Statistician, Bureau of Agricultural Economics collaborating with the Iowa Agricultural Experiment
Station, respectively.
For purposes of illustration designs for 3^4 (=81) varieties will be
considered first. With blocks of 3 plots, at least 4 replicates must be
used, but with blocks of 9 plots the minimum number of replicates is
two. With blocks of 9 plots 8 degrees of freedom must be confounded
in each replicate and, denoting the pseudo-factors by a, b, c, d, the best
scheme of confounding for two replicates is to confound A, B, AB, AB^2
in one replicate and C, D, CD, CD^2 in the other. This is equivalent to
regarding the experiment as a 9^2 experiment with two pseudo-factors
X, Y each with 9 levels, and confounding X in one replicate and Y in the
second. From the general formula given earlier, the mean variance of
varietal comparisons in this case will be 

$$\frac{2}{40}\left(\frac{8}{w + w'} + \frac{32}{2w}\right) = \frac{2}{5}\left(\frac{1}{w + w'} + \frac{4}{2w}\right).$$

If three replicates are to be used, the confounding in each replicate may
be chosen so that no contrast is confounded in more than one replicate;
for example by using the above two replicates and a third in which
ABC, AB^2D, AC^2D^2 and BC^2D are confounded, and in this case the
mean variance of varietal comparisons will be 

$$\frac{2}{40}\left(\frac{12}{2w + w'} + \frac{28}{3w}\right) = \frac{1}{5}\left(\frac{3}{2w + w'} + \frac{7}{3w}\right).$$

Both of the above results are given by Cox, Eckhardt and Cochran [2]
with a detailed description of the analyses of the two experiments. These
examples do not illustrate all the principles involved because they may
be described as simple and triple lattices respectively for $k^2$ varieties in
blocks of k plots, k being $3^2$.

To illustrate the problem in full generality the case of $2^5$ (=32)
varieties in blocks of $2^2$ (=4) plots will be described. The corresponding
factorial scheme consists of five factors which will be denoted by a, b, c,
d, e, each having two levels. In order to reduce the size of block to 4
plots, it will be necessary to confound seven degrees of freedom in each
replicate. The following is a suitable scheme for three replicates:

replicate 1 confounding A, B, AB, C, AC, BC, ABC;

replicate 2 confounding A, C, AC, D, AD, CD, ACD;

replicate 3 confounding B, D, BD, E, BE, DE, BDE.

The analysis of variance for this experiment will be as follows:

TABLE 1

ANALYSIS FOR $2^5$ VARIETIES IN BLOCKS OF 4 PLOTS

                                                                   Expectation of
                                                            DF     Mean Square

Replicates                                                   2

Blocks adjusted for varieties
  Comparison of effects and interactions between
  the two replicates in which they are confounded           5     $\sigma^2 + 4\sigma_b^2$

  Comparison of effect or interactions in the two
  replicates in which they are confounded with
  corresponding effect or interactions in replicate
  in which they are unconfounded                             5     $\sigma^2 + \tfrac{1}{3}\cdot 4\sigma_b^2$

  Comparison of effect or interaction confounded in
  one replicate with mean unconfounded effect in
  the other two replicates                                  11     $\sigma^2 + \tfrac{2}{3}\cdot 4\sigma_b^2$

Varieties ignoring blocks                                   31

Intrablock error                                            41     $\sigma^2$

Total                                                       95


The effects and interactions $A$, $B$, $C$, $AC$, and $D$ on a per-plot basis
will be determined with variance $1/\{8(w + 2w')\}$, effects and interactions
$E$, $AB$, $BC$, $ABC$, $AD$, $CD$, $ACD$, $BD$, $BE$, $DE$, and $BDE$ with variance
$1/\{8(2w + w')\}$ and the remaining interactions with variance $1/\{8(3w)\}$,
where $w = 1/\sigma^2$ and $w' = 1/(\sigma^2 + 4\sigma_b^2)$. The mean variance of varietal
comparisons will be, by formula of the earlier paper [4],

$$\frac{2}{31}\left\{\frac{5}{w + 2w'} + \frac{11}{2w + w'} + \frac{15}{3w}\right\}.$$

The total number of different schemes for a $p^n$ system in blocks of $p^s$
plots is equal to

$$\frac{(p^n - 1)(p^n - p)(p^n - p^2)\cdots(p^n - p^{n-s-1})}{(p^{n-s} - 1)(p^{n-s} - p)(p^{n-s} - p^2)\cdots(p^{n-s} - p^{n-s-1})}$$

and the practical problem is to choose out of all these schemes a number
of schemes, one for each replicate, in such a way that the confounding
is distributed as equally as possible over all the effects and interactions.
Suitable schemes for $p^{2n}$ varieties in blocks of $p^n$ plots are directly
obtainable from completely orthogonalised squares of side $p^n$, which are
published in the literature or may be generated by the method described
by Stevens [6]. Suitable schemes for three replicates of $p^{3n}$ varieties in
blocks of $p^n$ plots are obtainable as described by Yates [7] for the
three-dimensional lattice. The following schemes for other cases seem worthy
of mention (there is little point, for example, in giving a design for $2^7$
(=128) varieties in blocks of 4 or 8 plots, since designs for $5^3$ (=125)
varieties in blocks of 5 plots are likely to be quite satisfactory), generators
only of the schemes of confounding being given:

32 varieties in blocks of 8 plots:
     replicate 1, A, B,
     replicate 2, C, D,
     replicate 3, AC, DE,
the efficiency factor being 87 percent.

256 varieties in blocks of 8 plots:
     replicate 1, A, B, C, D, E,
     replicate 2, D, E, F, G, H,
     replicate 3, A, B, C, F, G,
the efficiency factor being 81 percent.

243 varieties in blocks of 9 plots:
     replicate 1, A, B, C,
     replicate 2, A, B, D,
     replicate 3, C, D, E,
the efficiency factor being 90 percent.
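The count of possible confounding schemes quoted earlier for a $p^n$ system in blocks of $p^s$ plots can be evaluated directly. The following is a minimal sketch in Python; the function name is hypothetical, and it assumes the count is read as the number of subgroups of order $p^{n-s}$ among the effects and interactions, which is how the product formula above has been reconstructed.

```python
def count_confounding_schemes(p: int, n: int, s: int) -> int:
    """Number of ways to choose the confounded subgroup for a p**n system
    in blocks of p**s plots (n - s independent generators are needed)."""
    k = n - s
    numerator = 1
    denominator = 1
    for i in range(k):
        numerator *= p**n - p**i      # (p^n - 1)(p^n - p)...(p^n - p^(k-1))
        denominator *= p**k - p**i    # (p^k - 1)(p^k - p)...(p^k - p^(k-1))
    return numerator // denominator


if __name__ == "__main__":
    # 2^5 varieties in blocks of 4 plots, as in Table 1
    print(count_confounding_schemes(2, 5, 2))
```

For the $2^5$ case in blocks of 4 plots this gives 155 schemes, from which three were selected above, one for each replicate.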

There is also available a wide range of designs for $p^n$ varieties using
the split-plot analogy. Thus with 4 pseudo-factors a, b, c and d, it is
possible to confound A with blocks of $p^3$ plots, B with blocks of $p^2$
within blocks of $p^3$, C with blocks of p within blocks of $p^3$, D being
unconfounded. In this case, four replicates give a design with reasonable
balance. These designs will not be discussed in detail here, because the
split-plot principle is used to the best advantage when whole plots are
subject to two restrictions. An example of this type is given later.

THE COMPARISON OF LATTICE DESIGNS WITH DIFFERENT SIZES OF BLOCK

Cox, Eckhardt and Cochran [2] gave the analysis for $9^2$ varieties in
blocks of 9 using three replicates such as those given above, and these
replicates were duplicated. The approach of the present papers suggests
a different design for this particular case, since the 81 varieties may be
regarded as making up a $3^4$ system and blocks of 3 plots may be used.
It would be possible to use the following confounding:

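replicate 1: $A$, $B$, $C$, and all their interactions

replicate 2: $A$, $B$, $D$, and all their interactions

replicate 3: $A$, $C$, $D$, and all their interactions

replicate 4: $B$, $C$, $D$, and all their interactions

replicate 5: $AB$, $AC$, $AD$, and all their interactions

replicate 6: $AB^2$, $AC^2$, $AD^2$, and all their interactions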

The mean variance of comparisons with this design is

$$\frac{1}{20}\left\{\frac{3}{2w + 4w'} + \frac{11}{3w + 3w'} + \frac{9}{4w + 2w'} + \frac{15}{5w + w'} + \frac{2}{6w}\right\}.$$
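The numerators in this expression are the numbers of effects and interactions confounded in 4, 3, 2, 1 and 0 of the six replicates. The following sketch in Python tallies them by enumeration; the representation of each effect as a vector of exponents modulo 3 and all function names are assumptions made for the illustration, not part of the original treatment.

```python
from collections import Counter
from itertools import product

P = 3          # levels of each pseudo-factor
N_FACTORS = 4  # pseudo-factors a, b, c, d


def normalise(v):
    """Scale an exponent vector (mod P) so that its first non-zero entry is 1."""
    lead = next(x for x in v if x)
    inv = pow(lead, -1, P)
    return tuple((inv * x) % P for x in v)


def confounded(generators):
    """All effects and interactions generated by the given exponent vectors."""
    found = set()
    for coeffs in product(range(P), repeat=len(generators)):
        v = tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % P
                  for i in range(N_FACTORS))
        if any(v):
            found.add(normalise(v))
    return found


A, B, C, D = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
replicates = [
    [A, B, C], [A, B, D], [A, C, D], [B, C, D],
    [(1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)],   # AB, AC, AD
    [(1, 2, 0, 0), (1, 0, 2, 0), (1, 0, 0, 2)],   # AB^2, AC^2, AD^2
]
confounded_sets = [confounded(g) for g in replicates]
all_effects = {normalise(v) for v in product(range(P), repeat=N_FACTORS) if any(v)}

# tally[k] = number of effects confounded in exactly k of the six replicates
tally = Counter(sum(e in s for s in confounded_sets) for e in all_effects)

if __name__ == "__main__":
    print(sorted(tally.items(), reverse=True))
```

Each replicate confounds 13 of the 40 effects and interactions, and the tally (3, 11, 9, 15 and 2 effects confounded four, three, two, one and no times) agrees with the numerators of the mean variance.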

For various values of w/w', the efficiency of each of these designs relative
to complete randomised blocks is as follows:

w/w'                      1    2    3    4    5    6    7    8    9   10

3 replicates of $9^2$     100  104  111  118  126  134  143  151  160  168

6 replicates of $3^4$     100  110  126  144  163  183  202  222  242  262
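The tabulated efficiencies can be recomputed from the two mean variances given earlier. The following Python sketch does so under one stated assumption, which is not spelled out in the text: the error variance for complete randomised blocks is taken as the pooled between-block and within-block mean square in a replicate of 81 plots (8 and 72 degrees of freedom for blocks of 9; 26 and 54 for blocks of 3). Function names are hypothetical.

```python
def efficiency_blocks_of_9(x):
    """Triple lattice for 9^2 varieties; x is w/w' for blocks of 9 plots."""
    w, w_prime = 1.0, 1.0 / x
    mean_var = (1 / 5) * (3 / (2 * w + w_prime) + 7 / (3 * w))
    # assumed randomised-block error variance: pooled over 8 + 72 d.f.
    rb_error = (8 / w_prime + 72 / w) / 80
    return (2 * rb_error / 3) / mean_var      # 3 replicates


def efficiency_blocks_of_3(x):
    """Six replicates of the 3^4 system; x is w/w' for blocks of 3 plots."""
    w, w_prime = 1.0, 1.0 / x
    mean_var = (1 / 20) * (3 / (2 * w + 4 * w_prime) + 11 / (3 * w + 3 * w_prime)
                           + 9 / (4 * w + 2 * w_prime) + 15 / (5 * w + w_prime)
                           + 2 / (6 * w))
    # assumed randomised-block error variance: pooled over 26 + 54 d.f.
    rb_error = (26 / w_prime + 54 / w) / 80
    return (2 * rb_error / 6) / mean_var      # 6 replicates


if __name__ == "__main__":
    for x in range(1, 11):
        print(x, round(100 * efficiency_blocks_of_9(x)),
              round(100 * efficiency_blocks_of_3(x)))
```

Duplicating the three replicates of the triple lattice leaves its efficiency relative to randomised blocks unchanged, so the first function applies equally to the six-replicate experiment of Cox, Eckhardt and Cochran.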

The ratio w/w' will not be the same for both designs laid out on the same
land, since for the design in blocks of 3 the relevant variances are those
within blocks of 3 and between blocks of 3 amongst 81 plots, while for the
design in blocks of 9 the relevant variances are those within and between
blocks of 9 plots.

The empirical law of soil heterogeneity obtained by Fairfield Smith
[5] may be used to obtain a partial answer to this question. He postulated
the relationship

$$V_x = \frac{V_1}{x^b},$$

where $V_x$ is the variance per unit area with plots of size x, $V_1$ the variance
per unit area with plots of unit area, and b is a parameter depending
on the extent of heterogeneity in the experimental area. He further
obtained the relationship

$$\frac{(V_x)_n}{(V_x)_m} = \frac{(m - 1)(n - n^{1-b})}{(n - 1)(m - m^{1-b})}$$

to give the expectation of the relative efficiency of a randomised block
experiment with m plots relative to one with n plots per block, $(V_x)_m$
being the variance of a mean per unit area with m plots of size x units
per block, and $(V_x)_n$ being defined similarly.

With these relationships the relative values of the ratios w/w' for 
blocks of 3 and blocks of 9 within a replicate of 81 plots may be obtained. 
Consider the analysis given in Table 2. 

TABLE 2

ANALYSES OF VARIANCE FOR BLOCKS OF THREE AND BLOCKS
OF NINE WITHIN A BLOCK OF 81 PLOTS

                            Degrees        Mean        Sum of
                            of freedom     square      squares

Total                           80         $\alpha$      $80\alpha$

Blocks of 3
  Among blocks                  26         $\delta$      $26\delta = 80\alpha - 54\beta$
  Within blocks                 54         $\beta$       $54\beta$

Blocks of 9
  Among blocks                   8         $\epsilon$    $8\epsilon = 80\alpha - 72\gamma$
  Within blocks                 72         $\gamma$      $72\gamma$
  Among blocks of 3
  within blocks of 9            18         $\lambda$     $18\lambda = 72\gamma - 54\beta$

Now, the sums of squares $80\alpha$ and $72\gamma$ may be computed from the
following relations, respectively:

$$\frac{\alpha}{\beta} = \frac{(V_x)_{81}}{(V_x)_3} = \frac{2}{80}\left(\frac{81 - 81^{1-b}}{3 - 3^{1-b}}\right) = \frac{54}{80}\left\{1 + 3^{-b} + 3^{-2b} + 3^{-3b}\right\},$$

$$\frac{\gamma}{\beta} = \frac{(V_x)_9}{(V_x)_3} = \frac{2}{8}\left(\frac{9 - 9^{1-b}}{3 - 3^{1-b}}\right) = \frac{3}{4}\left\{1 + 3^{-b}\right\}.$$

$\beta$ may be taken equal to unity, and the quantities $\alpha$, $\gamma$, $\delta$, and $\epsilon$ may be
evaluated in terms of b. For various values of b the ratio w/w' may be
computed for blocks of 3 and blocks of 9 within a replicate of 81 plots.
It was found that w/w' ($= \epsilon/\gamma$) for blocks of 9 plots was approximately
1.55 times w/w' ($= \delta/\beta$) for blocks of 3 plots within a replicate of 81 plots.
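The evaluation of the mean squares of Table 2 in terms of b can be sketched as follows. This is an illustrative Python fragment with hypothetical function names; it takes $\beta = 1$, uses the two relations above for $\alpha$ and $\gamma$, and obtains $\delta$ and $\epsilon$ from the sums of squares in Table 2.

```python
def table2_mean_squares(b):
    """Mean squares alpha, gamma, delta, epsilon of Table 2, taking beta = 1."""
    t = 3.0 ** (-b)
    alpha = (54 / 80) * (1 + t + t**2 + t**3)
    gamma = (3 / 4) * (1 + t)
    delta = (80 * alpha - 54) / 26             # among blocks of 3
    epsilon = (80 * alpha - 72 * gamma) / 8    # among blocks of 9
    return alpha, gamma, delta, epsilon


def weight_ratios(b):
    """Return w/w' for blocks of 3 (delta/beta) and for blocks of 9 (epsilon/gamma)."""
    alpha, gamma, delta, epsilon = table2_mean_squares(b)
    return delta, epsilon / gamma


if __name__ == "__main__":
    for b in (0.2, 0.4, 0.6, 0.8, 1.0):
        ratio_3, ratio_9 = weight_ratios(b)
        print(b, round(ratio_3, 3), round(ratio_9, 3))
```

At b = 1 both ratios equal unity, and as b approaches zero they approach 81/13 and 9 respectively.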

The quantity b ranges from zero to unity, unless there are negative
correlations between plots in a block (or competition) when it can exceed
unity by a small amount depending upon the size of the block. The
maximum value that w/w' can take for blocks of 3 is (for 0 < b < 1)
81/13, and for blocks of 9 it is 9. In practice, however, there appears to
be no limit to the value for w/w' for any design. More information is
needed then to obtain a complete answer for the relative values of w/w'
for the two designs, incomplete blocks of 3 and of 9. Fairfield Smith was
unable to verify his law for small values of b, and it appears that the
relationships postulated by him are not very accurate for values of b
less than 0.2. Extensive uniformity trial data would need to be examined
before a relationship could be postulated for small values of b.
The above suggests however that for most types of soil heterogeneity
the design with blocks of 3 plots will yield more information than that
with blocks of 9 plots.


DESIGNS WITH TWO RESTRICTIONS 

Lattice designs for a number of varieties $k^2$, with two restrictions,
have been described by Yates [8], who has called them lattice squares,
and by Cochran [1]. Such designs are based on completely orthogonalised
Latin Squares of side k, and since such squares exist when k is a
prime number or a power of a prime, the present treatment may be
extended to this case. If for example the number of varieties is 25, the
effects and interactions may be represented by $A$, $B$, $AB$, $AB^2$, $AB^3$ and
$AB^4$, each with 4 degrees of freedom, and it is possible to form squares
such that each effect or interaction is confounded with the rows or with
the columns of one square. For the semi-balanced design $A$ is confounded
with rows and $B$ with columns in one square, $AB$ with rows and
$AB^2$ with columns in a second square, and $AB^3$ with rows and $AB^4$ with
columns in a third square. The information on effects $A$, $AB$, and $AB^3$
will be $5^2(2w + w_r)$, and on effects $B$, $AB^2$, $AB^4$, it will be $5^2(2w + w_c)$,
where w is the reciprocal of the intra-row and intra-column variance
and $w_r$ and $w_c$ are respectively the reciprocals of the inter-row and
inter-column variances. If the intra-row and intra-column variance is $\sigma^2$ and
the additional variance between rows is $\sigma_r^2$ and between columns $\sigma_c^2$,
w will be equal to $1/\sigma^2$, $w_r$ to $1/(\sigma^2 + p\sigma_r^2)$ and $w_c$ to $1/(\sigma^2 + p\sigma_c^2)$. The
mean variance of varietal comparisons will be, by analogy with results
in the earlier paper,

$$\frac{2}{6}\left(\frac{3}{2w + w_r} + \frac{3}{2w + w_c}\right) = \left(\frac{1}{2w + w_r} + \frac{1}{2w + w_c}\right).$$

It is clear furthermore that the use of two squares only is a valid design:
if for example the first two of the above are chosen, the mean variance
of varietal comparisons will be

$$\frac{2}{6}\left(\frac{2}{w + w_r} + \frac{2}{w + w_c} + \frac{2}{2w}\right) = \frac{2}{3}\left(\frac{1}{w + w_r} + \frac{1}{w + w_c} + \frac{1}{2w}\right).$$

If 6 (= p + 1) squares are used, each effect may be confounded with
rows and with columns, and the mean variance per comparison is

$$\frac{2}{6}\left(\frac{6}{4w + w_r + w_c}\right) = \frac{2}{4w + w_r + w_c}.$$

The analysis of the completely balanced lattice square has been
given by Yates [8] but it seems worth while to describe it briefly in
terms of the concepts used in the present paper. Taking for example the
case when p = 5, there will be 6 squares in all, the confounding being for
example:

              Confounded with
Square        Rows        Columns

  1           $A$         $B$
  2           $AB$        $AB^2$
  3           $AB^3$      $AB^4$
  4           $B$         $A$
  5           $AB^2$      $AB$
  6           $AB^4$      $AB^3$


It may be noted that with a particular factorial correspondence more
than one system of confounding is possible. Estimates of the A effect
are obtained from square 1 with a variance based on $(\sigma^2 + 5\sigma_r^2)$, square 4
with a variance based on $(\sigma^2 + 5\sigma_c^2)$ and from each of the remaining
squares with a variance based on $\sigma^2$, and these estimates are combined
weighting inversely as their variances. The estimation of $(\sigma^2 + 5\sigma_r^2)$ is
performed by noting that the comparison of the A effect in square 1 with the
average A effect in squares 2, 3, 5 and 6 will contain only intra-row and
intra-column and row errors. In fact, the contribution of square 1 to
the sum of squares for rows, eliminating varieties and columns, is

$$\frac{1}{100}\left\{\sum_i \left[4(A)_{i1} - \left((A)_{i2} + (A)_{i3} + (A)_{i5} + (A)_{i6}\right)\right]^2 - \frac{(4 \times \text{total of square 1} - \text{sum of totals of squares 2, 3, 5, 6})^2}{5}\right\},$$

where $(A)_{ij}$ is the total of plots at level i in square j. The contribution
of square 1 to the sum of squares for columns eliminating treatments is

$$\frac{1}{150}\left\{\sum_i \left[5(B)_{i1} - \left((B)_{i2} + (B)_{i3} + (B)_{i4} + (B)_{i5} + (B)_{i6}\right)\right]^2 - \frac{(5 \times \text{total of square 1} - \text{sum of totals of squares 2, 3, 4, 5, 6})^2}{5}\right\}.$$


The identity of these expressions with those given by Yates is not
immediately obvious and his method is the more expeditious computationally.
The identity of the adjusted variety means by the present treatment
with those given by Yates is easily verified. Corresponding formulas
for designs in which only a selection of the total (p + 1) squares are
used may be written down at sight of the structure of the design.

The restriction of the term lattice squares to arrangements of $k^2$
varieties in squares of side k, when a completely orthogonalised Latin
Square of side k exists, appears unduly restrictive. In the case of number
of varieties $k^2$ where k is quite general except that a Latin Square of side
k exists, it is possible to utilize the effectiveness of the Latin Square
arrangement in controlling heterogeneity. The k by k Latin Square
gives three orthogonal groupings of the $k^2$ varieties in k groups of k, by
rows, columns and letter. If these groupings are denoted by $\alpha$, $\beta$, and
$\gamma$ respectively it is entirely feasible to use three k by k squares or a
multiple of these with the following confounding:

Confounded with      Square I     Square II     Square III

Rows                 $\alpha$     $\beta$       $\gamma$
Columns              $\beta$      $\gamma$      $\alpha$


The advantages of lattice square arrangements relative to lattice
arrangements depend to some extent on the shape of plot to be used [1].
In corn yield tests it is customary to use plots of size 2 by 10 hills and
for this shape of plot lattice square arrangements are not likely to yield
much greater precision than lattice arrangements, in addition to the fact
that they result in awkward shaped replicates. If however it is possible
to use plots which are more nearly square the lattice square arrangement
will almost certainly be more advantageous. Under such circumstances
the design given above would appear to be appropriate. This design
may be called an unbalanced lattice square, the situation being analogous
to the case of lattice designs with one restriction, in that simple and
triple lattices are always possible for $k^2$ varieties but that the balanced
lattice exists only if a completely orthogonalised Latin Square of side k
exists.

UTILIZING THE LATIN SQUARE AND SPLIT-PLOT PRINCIPLES 

A detailed enumeration of all possible designs will not be given, but 
the following are simple extensions. 

If $p^3$ varieties are being tested, they may be represented formally as
the combinations of three factors a, b, c and factors a and b may be
imposed as whole-plot treatments and c as a split-plot treatment. The
whole-plot factors may be estimated by the use of lattice squares with at
least two replicates, each plot of the squares being split for the factor c.
In these cases information will be of four types:

(a) inter-row with variance $1/w_r$,

(b) inter-column with variance $1/w_c$,

(c) intra-row and column with variance $1/w$,

(d) intra-whole plots with variance $1/w_s$.

If for example each of the factors has 5 levels, a suitable set of three
replicates is given by

               Confounded with           Lattice Square
               Rows        Columns       plots split for

Square 1       $A$         $B$           $C$
Square 2       $AB$        $AB^2$        $C$
Square 3       $AB^3$      $AB^4$        $C$

For the estimation of the effects $A$, $B$ and interactions $AB$, $AB^2$, $AB^3$,
$AB^4$, the procedure is the same as if the plots of the square were not
split. An estimate of the split-plot error $\sigma_s^2$ ($= 1/w_s$) may be obtained
from the interaction with squares of all two and three factor interactions
involving c. The mean variance of varietal comparisons will be

$$\frac{2}{31}\left(\frac{3}{2w + w_r} + \frac{3}{2w + w_c} + \frac{25}{3w_s}\right).$$

It is possible to use a design similar to the above with only two replicates
when the number of varieties is $p^2q$ where q is not equal to p and
need not be a prime. In such a case, however, while the analysis will
be of a similar structure, the mean variance of varietal comparisons is
not quite as obvious, and as the purpose of the present paper is to deal
with prime-power lattice designs, it will not be discussed here.

It is quite probable that with such a layout $w_r$ and $w_c$ will be very
small and w will be small in comparison with $w_s$. In order to improve
the accuracy of the experiment, the following scheme may be followed.

               Confounded with           Lattice Square
Replicate      Rows        Columns       plots split for

I              $A$         $B$           $C$
II             $C$         $A$           $B$
III            $B$         $C$           $A$

With such a design the variance of the $A$, $B$, and $C$ effects will be
$1/\{p^2(w_r + w_c + w_s)\}$, the variance of the interactions $AB$, $AB^2$, $AB^3$,
$AB^4$, $AC$, $AC^2$, $AC^3$, $AC^4$, $BC$, $BC^2$, $BC^3$, and $BC^4$ will be $1/\{p^2(w + 2w_s)\}$,
and for all interactions involving three factors the variance will be
$1/\{p^2(3w_s)\}$. The mean variance of varietal comparisons will be

$$\frac{2}{31}\left\{\frac{3}{w_r + w_c + w_s} + \frac{12}{w + 2w_s} + \frac{16}{3w_s}\right\}.$$

The analysis will follow the general lines of the present and previous
paper [4]. The weights for row comparisons, column comparisons, whole
plot and split-plot comparisons may be obtained from:

     The mean square for rows eliminating varieties and columns
     and whole plots, which is obtained from the comparison of main
     effects in squares in which they are confounded with rows with
     the same main effects in squares in which they are unconfounded
     with rows, columns or whole-plots;

     The mean square for columns, eliminating varieties and rows
     and whole plots, obtained likewise;

     The error mean square for whole plots, obtained from the
     comparison of two-factor interactions in the squares in which
     they are confounded with whole-plots with the same interactions
     in the other two squares in which they are unconfounded;

     The split-plot error mean square, obtained from the comparison
     of two-factor interactions amongst squares in which they are
     unconfounded, and the interaction of the three-factor interactions
     with the three squares.


This design appears to be eminently suited for corn breeding work in
which the basic plot is long and narrow. It is customary to use plots of
size 2 by 10 hills and the arrangement of p of these (for small p) in a
whole-plot would result in whole plots that are more or less square and
the full advantages of the Latin Square control of row and column
effects would be utilizable.

Alternatively to the above split-plot design for $p^3$ varieties, we may,
as Yates [9] pointed out, divide the varieties into p groups of $p^2$ varieties
and test each group of $p^2$ varieties with p X p lattice squares, of which
only two are absolutely necessary for each group. The division into p
groups of $p^2$ varieties may be made by choosing one effect or interaction
to be confounded with groups in each replicate, and a large number of
possible groupings are available. If the pseudo-factors are denoted by
a, b, c, the possible replicates are obtained by choosing one effect or
interaction to be confounded with squares and other interactions to be
confounded with rows and columns within squares: if the factors each
have 5 levels, for example, the following are 9 suitable replicates:

            Confounded with

Squares      Rows        Columns

$A$          $B$         $C$
$A$          $BC$        $BC^2$
$A$          $BC^3$      $BC^4$
$B$          $C$         $A$
$B$          $AC$        $AC^2$
$B$          $AC^3$      $AC^4$
$C$          $A$         $B$
$C$          $AB$        $AB^2$
$C$          $AB^3$      $AB^4$


This table could be extended (a) by interchanging rows and columns and
(b) by confounding between squares each of the possible 31 effects and
interactions. The minimum number of replicates which must be used is
four, whatever the value of p, but such a design would have a low relative
efficiency. Information in such a design consists of:

     (a) among squares,

     (b) among rows within squares,

     (c) among columns within squares,

     (d) within rows and columns,

and the relative efficiency may be evaluated in terms of the variances of
these types of information. As stated by Yates [9] the efficiency factor
of the design given above (the ratio of the mean variance of varietal
comparisons in complete randomised blocks to the mean variance in the
design when information other than within rows and columns is assumed
to be valueless and when the error variance is assumed to be the same
in both designs) is

$$\frac{(p - 1)(p^2 + p + 1)}{(p + 1)(p^2 + p + 2\tfrac{1}{2})}.$$

This factor is obtained by noting that the three main effects will be
determined with variance $1/\{(p - 1)w\}$ and the remaining $(p^2 + p - 2)$
effects and interactions with variance $2/\{3(p - 1)w\}$ so that the mean
variance of varietal comparisons would be

$$\frac{2(p - 1)}{p^3 - 1}\left[\frac{3}{(p - 1)w} + \frac{2(p^2 + p - 2)}{3(p - 1)w}\right] = \frac{4}{3w}\left[\frac{p^2 + p + 2\tfrac{1}{2}}{(p^2 + p + 1)(p - 1)}\right],$$

compared with $4/\{3(p + 1)w\}$ with complete randomised blocks and
the same error variance.
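The step from the two variances to the efficiency factor can be checked with exact rational arithmetic. The sketch below is illustrative only; function names are hypothetical, variances are expressed in units of 1/w, and the closed form is the reconstruction of the factor quoted from Yates with $2\tfrac{1}{2}$ written as 5/2.

```python
from fractions import Fraction


def mean_variance_times_w(p):
    """Mean variance of varietal comparisons, multiplied by w."""
    n_effects = p * p + p + 1                  # (p^3 - 1)/(p - 1) effects in all
    main = Fraction(1, p - 1)                  # each of the 3 main effects
    rest = Fraction(2, 3 * (p - 1))            # each remaining effect or interaction
    return Fraction(2, n_effects) * (3 * main + (n_effects - 3) * rest)


def efficiency_factor(p):
    """Ratio of the randomised-block variance to the mean variance of the design."""
    randomised_blocks = Fraction(4, 3 * (p + 1))
    return randomised_blocks / mean_variance_times_w(p)


def closed_form(p):
    """(p - 1)(p^2 + p + 1) / ((p + 1)(p^2 + p + 5/2)), cleared of fractions."""
    return Fraction(2 * (p - 1) * (p * p + p + 1),
                    (p + 1) * (2 * p * p + 2 * p + 5))


if __name__ == "__main__":
    for p in (3, 5, 7):
        print(p, efficiency_factor(p), float(efficiency_factor(p)))
```

For p = 5, the case of the nine replicates listed above, the factor is 124/195, or about 64 percent.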

A further extension is the testing of $p^4$ varieties involving pseudo-factors
a, b, c and d in which factors a and b are applied in a lattice
square arrangement, the plots being split for factors c and d which are
also applied in a lattice square arrangement.

ASSAYS OF INSULIN WITH
ONE BLOOD SAMPLE PER RABBIT PER TEST DAY

D. M. Young and R. G. Romans

Connaught Medical Research Laboratories
University of Toronto, Toronto, Canada


Biological determinations of potency are subject to a sampling error
internal to each assay and also to variation between independent assays.
The first component, sometimes designated as $s_M$, is readily computed
from the data of a single self-contained assay. To evaluate the second
component requires two or more independent experiments. It is usually
assumed to be negligible and the reliability of an assay is estimated from
its internal variability. The validity of this assumption can be checked
only by reassaying a single unknown independently on two or more
occasions. As results are reported with a given technique the confidence
to be placed upon its internal error can be assessed.

During 1944 an investigation [1] was made of the efficiency of a
number of procedures for use in the rabbit assay of insulin. The most
promising of these required only one sample of blood for glucose analysis
from each rabbit on each test day. During a year's experience with this
single-blood-sugar method twenty-one insulin samples of unknown potency
were assayed against the standard and seven samples of unknown
potency with other samples of unknown potency. The samples had
been prepared from a variety of sources (beef, pork and lamb) and varied
in purity. Since the number of assays conducted on any one sample
ranged from two to eight, it has been possible to compare the two sources
of error.


METHOD 

The rabbits used in the assays were of mixed stock; they were free
from disease so far as could be determined by a superficial examination;
and, except for the occasional animal, they ranged in weight from 1.6 to
3.0 kg. The rabbits were fed Purina Rabbit Chow (Complete Ration)
and water was available to them at all times. Before the rabbits were
injected with insulin, they were starved for a 16 to 18 hour period. The
specified quantities of insulin were then injected in 2 ml. of U.S.P. XII
[2] diluting fluid, using the marginal ear veins as the route of injection.
Fifty minutes after injection of insulin, slightly more than one ml. of
blood was taken from a marginal ear vein. Blood glucose in mgm. percent
was determined by the method of Nelson [3] and the values obtained
were used directly in computing the results. Rabbits placed on test
on successive days were starved for the length of time specified, injected
with insulin, bled as described, and then given access to food for 4 to 5
hours before being starved for the next day's treatment.

Between February, 1945, and March, 1946, 102 twelve-rabbit assays
of insulin were carried out using the technique described. Assays were
conducted each month during this period. The design, employing three
separately randomized latin squares, was patterned after that outlined
by Bliss and Marks [4]. Each rabbit received one dose of insulin on each
of four successive working days. Except when one unknown preparation
of insulin was being assayed against a second unknown, dose S1 contained
0.60 units and dose S2 1.2 units of Insulin Standard.¹ On the
basis of an assumed potency for the unknown under assay, doses U1 and
U2 were made up to contain 0.60 and 1.2 units of insulin respectively.
The order of injection and the resulting blood-sugar determinations in
one assay (No. 22) are exemplified in Tables 1 and 2. Table 3 illustrates
the calculation from these data of the potency and its error in a
self-contained assay.

¹Insulin Standard 8230, 23.0 International Units per mgm. (kindly supplied by the Insulin
Committee, University of Toronto).

TABLE 1

ORDER OF INJECTIONS
ASSAY NO. 22

Rabbit Number     1    2    3    4    5    6    7    8    9   10   11   12

Date
23/4/45          S2   U1   U2   S1   S2   U1   U2   S1   U1   U2   S1   S2
25/4/45          S1   U2   U1   S2   U2   S1   S2   U1   U2   U1   S2   S1
26/4/45          U1   S1   S2   U2   S1   U2   U1   S2   S2   S1   U1   U2
27/4/45          U2   S2   S1   U1   U1   S2   S1   U2   S1   S2   U2   U1

TABLE 2

RESPONSE EXPRESSED AS MG. PERCENT OF BLOOD SUGAR
ASSAY NO. 22

Rabbit Number     1    2    3    4    5    6    7    8    9   10   11   12    Total

Date
23/4/45          27   54   48   63   24   50   34   48   61   72   68   28      577
25/4/45          36   40   72   50   33   58   56   60   58   83*  62   65      673
26/4/45          54   61   60   50   57   26   59   46   46   75   68   54      656
27/4/45          38   36   60   59   46   35   61   47   54   62   50   56      604

Total           155  191  240  222  160  169  210  201  219  292  248  203    2,510

Total for Doses

Dose             S1    S2    U1    U2
Response        706   532   722   550
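The dose totals follow from Tables 1 and 2 by summing the responses under each dose. The Python sketch below shows the tabulation. The order of injections and the response matrix are entered as read from the two tables, on the assumption that the damaged entries are those consistent with the printed row, column and dose totals (in particular, 46 for rabbit 9 on 26/4/45); variable names are hypothetical.

```python
# Order of injections (Table 1) and responses (Table 2) for assay No. 22.
# Rows are the four test days; columns are rabbits 1 to 12.
order = [
    "S2 U1 U2 S1 S2 U1 U2 S1 U1 U2 S1 S2".split(),
    "S1 U2 U1 S2 U2 S1 S2 U1 U2 U1 S2 S1".split(),
    "U1 S1 S2 U2 S1 U2 U1 S2 S2 S1 U1 U2".split(),
    "U2 S2 S1 U1 U1 S2 S1 U2 S1 S2 U2 U1".split(),
]
response = [
    [27, 54, 48, 63, 24, 50, 34, 48, 61, 72, 68, 28],
    [36, 40, 72, 50, 33, 58, 56, 60, 58, 83, 62, 65],
    [54, 61, 60, 50, 57, 26, 59, 46, 46, 75, 68, 54],  # rabbit 9 taken as 46 (assumed)
    [38, 36, 60, 59, 46, 35, 61, 47, 54, 62, 50, 56],
]

# Sum the responses under each dose.
dose_totals = {"S1": 0, "S2": 0, "U1": 0, "U2": 0}
for doses, values in zip(order, response):
    for dose, value in zip(doses, values):
        dose_totals[dose] += value

day_totals = [sum(values) for values in response]
rabbit_totals = [sum(column) for column in zip(*response)]

if __name__ == "__main__":
    print(dose_totals)
    print(day_totals, sum(day_totals))
    print(rabbit_totals)
```

Each rabbit receives every dose once, and on each day every dose is given to three rabbits, one in each latin square.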


The rabbits employed for the work did not appear to suffer from the
frequent injections of insulin and, if allowed a week's rest between assays,
rabbits could be used 4 or 5 times. Convulsions were extremely rare.
Out of the 4896 blood-sugar determinations which would have been
required for a complete record only 29, or approximately 0.6%, were lost
for all reasons. These lost results were replaced, for computational
convenience, by means of a formula suggested by DeLury [12].²

TABLE 3

CALCULATION OF ESTIMATE OF POTENCY AND ITS STANDARD ERROR
ASSAY NO. 22

                      Factorial Coefficient
                      (x) for log. dose                                     Mean Square
Treatment             S1     S2     U1     U2     N* Σ(x²)    Σ(xY_p)      Σ²(xY_p) / N Σ(x²)

Samples               -1     -1     +1     +1        48          34           24.08 = D²
Slope                 -1     +1     -1     +1        48        -346         2494.08 = B²
Parallelism           +1     -1     -1     +1        48           2            0.08

Totals for Doses
(Y_p)                706    532    722    550

Error Variance (s²) = 45.72

*N (the number of responses at each dose level) = 12
M (the log. of the potency ratio U/S) = ID/B = -0.0296
s_M (the standard error of M) = sI(B² + D²)^{1/2}/B² = 0.0409
where I (the log. of the ratio S2/S1 and U2/U1) = 0.301
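The arithmetic of Table 3 can be retraced from the four dose totals. The sketch below is a minimal Python illustration; the variable names are hypothetical, the error variance 45.72 is taken from the table rather than recomputed, and the expression used for the standard error, sI(B² + D²)^{1/2}/B², is the reading of the damaged formula that reproduces the printed value.

```python
import math

dose_totals = {"S1": 706, "S2": 532, "U1": 722, "U2": 550}
N = 12            # responses at each dose level
I = 0.301         # log of the ratio of high to low dose
s_squared = 45.72 # error variance, taken from Table 3

coefficients = {
    "samples":     {"S1": -1, "S2": -1, "U1": +1, "U2": +1},
    "slope":       {"S1": -1, "S2": +1, "U1": -1, "U2": +1},
    "parallelism": {"S1": +1, "S2": -1, "U1": -1, "U2": +1},
}


def contrast_sum(name):
    """Sum of x * Y_p over the four doses."""
    return sum(x * dose_totals[d] for d, x in coefficients[name].items())


def mean_square(name):
    """Squared contrast divided by N * sum(x^2) = 12 * 4 = 48."""
    return contrast_sum(name) ** 2 / (N * 4)


D2 = mean_square("samples")
B2 = mean_square("slope")
# D/B carries the signs of the two contrasts, so the ratio of the sums suffices.
M = I * contrast_sum("samples") / contrast_sum("slope")
s_M = math.sqrt(s_squared) * I * math.sqrt(B2 + D2) / B2

if __name__ == "__main__":
    print(round(D2, 2), round(B2, 2), round(mean_square("parallelism"), 2))
    print(round(M, 4), round(s_M, 4), round(10 ** M, 3))
```

The antilogarithm of M gives the potency of the unknown relative to its assumed potency.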

RESULTS

The results of the 102 assays were used to compute the estimate in
Table 4 of the standard deviation (s), the slope (b), and the relative
potency based upon the assay slope. The date on which the assay was
begun, the degrees of freedom for error and the factorial difference
between the responses on standard and on unknown are also given in this
table. Save in five instances, there was no significant departure from
parallel dosage-response curves for standard and unknown insulin. This
is about the number of departures to be expected in such a series.

The agreement of the error variances with their mean of $s^2$ = 53.13
was tested with Bartlett's equation for $\chi^2$ [5]. The resulting $\chi^2$ (258.4;
n = 101) indicated a highly significant heterogeneity among the variances
obtained from the different assays. The standard deviation (s)
of an individual blood-sugar determination computed from the average
variance was 7.29.

The standard deviation (s) for each individual assay has been plotted
against the date it was started (Control Chart). The value based on the
average variance has been plotted as a solid horizontal line and the limits
expected to enclose 95% of the observations as parallel broken lines.
No adjustment has been made in the limits for the occasional assay with
fewer than 30 degrees of freedom. Of the 102 assays considered, 23 had
standard deviations which fell outside of the control limits. Visual
inspection of the control chart might suggest that the standard deviation
tended to increase in magnitude from February to October of 1945.
Such a trend could be due to seasonal variation in the experimental
animals or to some unknown factor having to do with the experimental
technique.

The mean square for the interaction of Slope X Assay was 57.87 with
101 degrees of freedom, which is not significantly different, by the usual
“F” tests, from the average error variance. This would suggest that the
slope for the logarithm-dosage response curve remained stable over the
period. The average slope of -41.44 was used in recomputing the
potency of each assay. No correlation was found between the values of
s and of b computed separately from the 102 assays.

[Control Chart: standard deviation (s) of each assay plotted against the day on which the assay was begun.]

²If the estimate of the value to be replaced be designated as y, then

$$y = \frac{4T_1 + 4T_2 + 4rT_3 - 2T}{12r - 6},$$

where $T_1$, $T_2$ and $T_3$ are the respective totals for the row, treatment, and column in which the
observation is missing; T is the total for all responses in the experiment from which the observation
is missing; and r is the number of 4 X 4 latin squares used in the experiment.
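The replacement formula quoted in the footnote on lost results can be illustrated on artificial data. In the Python sketch below the function and the demonstration are hypothetical: responses are built as an exact sum of day, dose and rabbit effects with no error, arranged in r cyclic 4 X 4 latin squares, so that the formula should return the removed value exactly.

```python
def replacement_value(row_total, treatment_total, column_total, grand_total, r):
    """Estimate of one missing response in r 4 x 4 latin squares with common rows.
    All totals are taken over the observations actually present."""
    return (4 * row_total + 4 * treatment_total + 4 * r * column_total
            - 2 * grand_total) / (12 * r - 6)


def demo(r=3, missing=(1, 5)):
    """Remove one cell from error-free additive data and estimate it."""
    day_effect = [0, 3, -2, 5]
    dose_effect = [10, -4, 7, 1]
    rabbit_effect = [2 * c - 7 for c in range(4 * r)]
    cells = {}
    for d in range(4):
        for c in range(4 * r):
            t = (d + c) % 4   # cyclic latin square within each group of 4 rabbits
            cells[(d, c)] = (t, 50 + day_effect[d] + dose_effect[t] + rabbit_effect[c])
    true_t, true_y = cells.pop(missing)
    row = sum(y for (d, c), (t, y) in cells.items() if d == missing[0])
    col = sum(y for (d, c), (t, y) in cells.items() if c == missing[1])
    trt = sum(y for (d, c), (t, y) in cells.items() if t == true_t)
    total = sum(y for t, y in cells.values())
    return true_y, replacement_value(row, trt, col, total, r)


if __name__ == "__main__":
    print(demo())
```

With r = 1 the expression reduces to the usual missing-plot formula for a single 4 X 4 latin square.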

The variation in independent estimates of potency of the same prep¬ 
aration was assessed from the factorial differences in Table 4. The mean 
square between replicates was 46.18 with 74 degrees of freedom. Since 
this was less than the average error variance within assays, there was 
no evidence of a variance component between assays, in addition to that 
observed within assays. 

The expected precision of an assay, $s_M = (s/b)(1/12)^{1/2}$ [see ref. 6],
using the average variance and slope, was found to be 0.0508. The
corresponding quantity computed from the variation among results of
replicated assays was 0.0488 when the individual slopes were used to
compute the estimates of potency, and 0.0473 when these estimates were
computed with the average slope. The standard error of the estimate
of potency was therefore of the order of 12%. Since the estimates of
error from the two sources were in good agreement, the internal evidence
of the assays provided a satisfactory basis for estimating the variability
to be expected in the results of replicated tests. An average slope might
be used to advantage in computing estimates of potency and its precision.
In view of the lack of homogeneity among the variances, one
might prefer to use the individual estimates of error.
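
As a rough check on the statement that these values correspond to a standard error of the order of 12%, the quoted values of s_M can be converted to percentages. The short Python sketch below assumes that s_M is expressed in common (base-10) logarithms of potency, which the text does not state explicitly; the function name is illustrative only.

```python
def percent_standard_error(s_m: float) -> float:
    """Approximate percentage standard error implied by s_m on the log10 scale."""
    return (10.0 ** s_m - 1.0) * 100.0


# Values of s_M quoted in the text (log10 scale is an assumption).
quoted = {
    "internal evidence (average variance and slope)": 0.0508,
    "replicated assays, individual slopes": 0.0488,
    "replicated assays, average slope": 0.0473,
}

for label, s_m in quoted.items():
    print(f"{label}: s_M = {s_m:.4f} -> about {percent_standard_error(s_m):.1f}%")
```

Under that assumption all three values fall between roughly 11% and 13%, in line with the figure given in the text.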


DISCUSSION 

Previous workers, employing somewhat more complicated bleeding
techniques, have used adequate statistical methods to estimate the
expected error of their insulin assays. Bliss and Marks [6], using eight
pure-bred Himalayan rabbits and bleeding each rabbit six times on each
test day, estimated the standard deviation of an estimate of potency from
a twelve-rabbit assay at 7.9%. Smith, Marks, Fieller and Broom [7],
using six bleedings per rabbit per test day and an alternative experimental
design, obtained results with an estimated standard deviation of
about 12.5% for a twelve-rabbit assay.

From the results of the 1944 investigation [1] the single-blood-sugar
method would be expected to yield a more precise or more reproducible
result for a given amount of labour than the more conventional
procedures [8,9]. In this connection, it is of interest to note a recent
communication from Bliss and Bartels [10]. After working over the data
obtained by Bliss and Marks [4], these workers have concluded that the
number of blood samples taken per rabbit per test day might be reduced
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TABLE 4 

RESULTS FROM 102 TWELVE-RABBIT ASSAYS

Assay    Assay      Insulin   d.f.                          Factorial    Estimate of
Number   Date       Sample    (error)    s        -b        Difference   Relative
                    Number                                  (Σ xY_p)     Potency

  1      6/2/45     740       30         7.88    48.59**       67        0.88
  2      "          "         30         6.75    35.71         10        0.97
  3      "          "         30         5.72    44.16          9        0.98
  4      "          "         30         6.10    49.14         11        0.98
  5      "          "         30         4.49    47.76        -33        1.07
  6      12/2/45    "         30         5.94    48.03        -19        1.04
  7      "          "         30         5.28    44.16         29        0.94
  8      "          "         30         4.77    34.05        -16        1.05
  9      16/2/45    1990      30         7.88    41.11        117        0.76
 10      "          "         30         8.22    47.34        -12        1.02
 11      "          "         30         5.93    46.51        112        0.79
 12      28/2/45    744       30         3.52    34.47**       -1        1.00
 13      "          "         30         6.63    39.04         36        0.91
 14      "          "         30         6.82    47.62         60        0.89
 15      "          "         30         7.92    43.60        -25        1.06
 16      5/3/45     "         30         5.45    41.12         15        0.97
 17      "          "         29         5.60    45.90         26        [?]
 18      "          "         30         7.57    50.39         62        0.89
 19      "          "         30         6.39    45.82        -19        1.04
 20      23/4/45    2056*     30         6.72    45.54         -9        1.02
 21      "          "         30         5.57    32.81        -17        1.05
 22      "          "         30         6.76    47.90         34        0.93
 23      30/4/45    2062*     30         8.12    51.36        -21        1.04
 24      "          "         30         8.46    40.14        -80        1.21
 25      "          "         30         8.14    44.85         12        0.97
 26      4/5/45     2073*     30         5.87    43.74         42        0.91
 27      "          "         30        10.01    36.96        105        0.76
 28      "          "         30         7.15    28.24         32        0.90
 29      11/5/45    675-1     29         6.65    46.79         72        0.86
 30      "          "         29         4.91    44.57         30        0.94
 31      "          "         30         5.34    26.99**      135        0.62
 32      17/5/45    2087*     30         7.62    36.54        -40        1.11
 33      "          "         30         8.99    35.30         83        0.80
 34      "          "         30         6.13    41.25        -44        1.11
 35      23/5/45    747       29         5.46    43.74        -24        1.05
 36      "          "         29         7.81    46.10        139        0.75
 37      "          "         30         5.18    34.47         45        0.88
 38      "          "         30         8.64    43.60        -23        1.05
 39      30/5/45    "         30         6.63    31.70          5        0.98
 40      "          "         30         8.23    37.65         50        0.88
 41      "          "         29         8.31    28.93         27        0.91
 42      "          "         28         5.16    43.74        -42        1.10
 43      5/6/45     2098*     30         6.44    48.73         -1        1.00
 44      "          "         30         [?]     33.78        -34        1.10

TABLE 4 (Continued)

Assay    Assay      Insulin   d.f.                          Factorial    Estimate of
Number   Date       Sample    (error)    s        -b        Difference   Relative
                    Number                                  (Σ xY_p)     Potency

 45      11/6/45    2104*     30         8.51    56.34         51        0.92
 46      "          "         30         7.91    42.50        -31        1.07
 47      "          "         30         7.26    45.54         -7        1.01
 48      1[?]/6/45  747-00    30         6.59    41.94        -29        1.07
 49      "          "         30         6.41    40.00          5        0.99
 50      "          "         30         6.47    35.16        -22        1.06
 51      22/6/45    714-7     30         5.20    42.08         50        0.89
 52      "          "         30         8.76    46.51        -96        1.22
 53      "          "         27         8.71    37.79         33        0.92
 54      31/7/45    714-8     27         9.78    45.54         93        0.82
 55      "          "         29         6.10    33.22         17        0.95
 56      "          "         30         6.65    46.10         25        0.95
 57      [?]/8/45   714-9     26         6.34    38.76         28        0.93
 58      "          "         30         7.48    34.74         21        0.94
 59      "          "         30         6.11    30.73        -16        1.05
 60      20/8/45    751       30        10.71    49.00         20        0.96
 61      "          "         28         5.17    40.00        -11        1.03
 62      "          "         26         9.50    38.48        -50        1.18
 63      27/8/45    "         30         6.67    37.10        -12        1.03
 64      "          "         29         7.81    48.31          3        0.99
 65      "          "         30         8.20    31.84        -36        1.11
 66      7/9/45     2121      30         7.15    44.44         65        0.87
 67      "          "         30         9.38    35.85         49        0.88
 68      "          "         30        10.23    47.62        -20        1.04
 69      14/9/45    2097A     30         6.52    54.54        -16        1.03
 70      "          "         30         4.88    40.14         -2        1.00
 71      "          "         30         8.77    33.49         80        0.79
 72      20/9/45    496-1     30         8.32    40.97       -178        1.52
 73      "          "         30         5.27    35.58       -179        1.62
 74      "          "         30         7.88    38.62        -79        1.22
 75      27/9/45    2150      30         8.61    38.90       -115        [?]
 76      "          "         30        10.42    27.55        -49        1.19
 77      3/10/45    2155      30         6.45    51.50        -66        [?]
 78      "          "         30         7.57    32.67         38        0.80
 79      26/10/45   2148      29         7.80    40.56        268        [?]
 80      "          "         30         9.11    49.70        165        0.73
 81      "          "         29         5.40    46.23**      180        0.69
 82      1/11/45    755       30         8.12    44.99        -17        1.04
 83      "          "         30         6.91    50.80        -55        1.11
 84      "          "         30         9.25    25.88        -41        1.16
 85      "          "         30         7.16    42.77        -79        1.19

TABLE 4 (Continued)

Assay    Assay      Insulin   d.f.                          Factorial    Estimate of
Number   Date       Sample    (error)    s        -b        Difference   Relative
                    Number                                  (Σ xY_p)     Potency

 86      3/12/45    2201*     30         5.50    49.83        -30        1.06
 87      "          "         30         7.21    53.99        -42        1.08
 88      "          "         29         8.02    41.39         57        0.88
 89      "          "         30         7.54    39.73        -87        1.23
 90      15/1/46    758       30         [?]     55.92       -124        1.24
 91      "          "         30         [?].47  20.76        -24        1.12
 92      29/1/46    "         30         6.40    45.20        -81        1.19
 93      "          "         30         8.10    35.44         32        0.92
 94      14/2/46    2141      30         [?].22  34.19        -27        1.08
 95      "          "         30         4.44    34.33        -64        1.20
 96      "          "         30         [?].00  42.77          9        0.98
 97      20/2/46    2151      30         7.71    55.79         -7        1.01
 98      "          "         30         7.14    39.45         43        0.90
 99      4/3/46     701       30         5.49    37.65        -20        1.05
100      "          "         30         9.40    49.28        -90        1.19
101      12/3/46    "         30         7.95    34.05**      -46        1.14
102      "          "         30         7.47    42.30          0        0.99

*One sample of unknown potency assayed in terms of a second sample also of unknown potency.
**Departure from parallelism significant at 5% level of probability.

without materially affecting the precision of the assay results. Lacey
[11] has considered the effect of reducing the number of blood samples
per rabbit per test day but has not adopted an abbreviated bleeding
schedule. Pugsley and Rampton [13] have compared the results obtained
when samples of insulin from different sources were assayed by
the single-blood-sugar method and by a modification of the U.S.P.
procedure. They found excellent agreement between results obtained
by the two methods and they confirmed the conclusion that the
single-blood-sugar method is the more economical procedure.

The applicability of the single-blood-sugar method of assay to insulin 
samples from different sources and of different degrees of purity is under 
study. A variety of insulin samples have been assayed both by this 
method and by the technique adapted from Lacey [8] by the United 
States Pharmacopoeia [2]. The agreement between results obtained thus 
far by the two methods has been encouraging. The single-blood-sugar 
method does not differentiate between insulins with prolonged action 
(such as protamine zinc insulin) and unmodified insulin. 

SUMMARY

Experience gained from the performance of 102 twelve-rabbit insulin
assays has been described. In carrying out the assays, insulin was
injected intravenously, and one sample of blood for glucose analysis was
drawn from each rabbit on each test day 50 minutes after injection.
Significant variations in the slope of the logarithm-dosage response curve
were not detected over a 14-month period, but there were significant
fluctuations in the estimate of error. The standard error of a
twelve-rabbit assay of the type described was of the order of 12%.
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QUERIES 


QUERY 57: The effect of a preservative added to fresh and to wilted
alfalfa silage was tried in miniature silos. The lactic acid
concentration, measured at various periods after the ensiling date, is
given in Table 1. Is the “Remainder” in the table of analysis of variance
a valid estimate of error?


TABLE 1

LACTIC ACID (MILLIGRAMS PER GRAM OF SILAGE) AT
SUCCESSIVE PERIODS IN ALFALFA SILAGE TREATED IN 4 WAYS

                   Fresh                     Wilted
Period      Check   Preservative      Check   Preservative
  1         13.4       16.0           14.4       20.0
  2         37.5       42.7           29.3       34.5
  3         65.2       54.9           36.4       39.7
  4         60.8       57.1           39.1       38.7
  5         37.7       49.2           39.4       39.7

Analysis of Variance

Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
Treatment                      3                   556             185
Period                         4                  2974             744
Remainder                     12                   596              49.7


ANSWER: The assumptions underlying the analysis of variance were
discussed by Eisenhart in Vol. 3, pages 1-21 of this journal
(March, 1947). From an examination of your data, the
only serious question appears to be this: Are there interactions between
the treatments and the periods of time? If so, two difficulties arise:
(i) the meaning of the main effects is restricted; and (ii) the mean square
for “Remainder” is larger than the expected value of the real error.

In your sample, it is plain that the curved regressions on time are
not the same for fresh silage and for wilted. In the fresh, the curve turns
downward at the later periods, while in the wilted it reaches a plateau.
I am a bit skeptical as to whether this is characteristic of the population;
I do not find it in some other data available. However, for illustration
it is worth testing the hypothesis of zero interactions.

On the assumption that the periods correspond to equal effective
intervals, the differences corresponding to polynomial regressions of the
first four powers are set out in Table 2. For the convenience of the
reader, the ξ-coefficients of Fisher and Yates are appended. As an
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TABLE 2 

DIFFERENCES FOR FOUR ORTHOGONAL REGRESSION COMPARISONS

                                Fresh                     Wilted
Power of Polynomial      Check   Preservative      Check   Preservative
Linear                    71.9       80.8           59.8       43.6
Quadratic               -126.5      -79.2          -33.6      -33.2
Cubic                    -22.3        4.4            5.4       11.3
Quartic                   49.1       -4.6           -1.4        5.1

                         Coefficients (periods 1 to 5)     Sum of Squares
Linear                   -2    -1     0    +1    +2              10
Quadratic                +2    -1    -2    -1    +2              14
Cubic                    -1    +2     0    -2    +1              10
Quartic                  +1    -4    +6    -4    +1              70


illustrative computation, the quadratic component for the fresh check is 

2(13.4) - 1(37.5) - 2(65.2) - 1(60.8) + 2(37.7) = -126.5. 

As is often the case in factorial experiments, it seems reasonable to
assume that the cubic and quartic interactions may be estimates of the
real error. These are calculated as follows:

$$\frac{1}{10}\left[(-22.3)^2 + \cdots + (11.3)^2 - \frac{(-22.3 + \cdots + 11.3)^2}{4}\right] = 67.31$$

$$\frac{1}{70}\left[(49.1)^2 + \cdots + (5.1)^2 - \frac{(49.1 + \cdots + 5.1)^2}{4}\right] = 26.84$$



The divisors, 10 and 70, are the usual sums of squares of the coefficients 
in Table 2. The sum of these results, 67.31 + 26.84 = 94.15, divided 
by the corresponding degrees of freedom, 3 from each comparison, yields 
the estimate of error, 15.7, about one third of the mean square for 
“Remainder” in Table 1. 
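
The arithmetic above can be retraced with a short Python sketch. It takes the observations of Table 1 and the coefficients of Table 2 as given; the variable and function names are illustrative and not part of the original answer.

```python
# Lactic acid (mg per g of silage) from Table 1, periods 1 to 5.
table1 = {
    "fresh_check":         [13.4, 37.5, 65.2, 60.8, 37.7],
    "fresh_preservative":  [16.0, 42.7, 54.9, 57.1, 49.2],
    "wilted_check":        [14.4, 29.3, 36.4, 39.1, 39.4],
    "wilted_preservative": [20.0, 34.5, 39.7, 38.7, 39.7],
}

# Orthogonal polynomial coefficients for five equally spaced periods (Table 2).
coefficients = {
    "linear":    [-2, -1, 0, 1, 2],
    "quadratic": [2, -1, -2, -1, 2],
    "cubic":     [-1, 2, 0, -2, 1],
    "quartic":   [1, -4, 6, -4, 1],
}


def contrast(values, coefs):
    """Regression component: sum of coefficient times observation."""
    return sum(v * c for v, c in zip(values, coefs))


def interaction_ss(power):
    """Sum of squares among the four treatment components for one power (3 d.f.)."""
    coefs = coefficients[power]
    comps = [contrast(values, coefs) for values in table1.values()]
    divisor = sum(c * c for c in coefs)  # 10, 14, 10 or 70
    return (sum(x * x for x in comps) - sum(comps) ** 2 / len(comps)) / divisor


cubic_ss = interaction_ss("cubic")
quartic_ss = interaction_ss("quartic")
error_ms = (cubic_ss + quartic_ss) / 6  # 3 d.f. from each comparison

print(round(contrast(table1["fresh_check"], coefficients["quadratic"]), 1))
print(round(cubic_ss, 2), round(quartic_ss, 2), round(error_ms, 1))
```

Run as written, the sketch should agree with the quadratic component for the fresh check and with the two interaction sums of squares and the error estimate quoted above.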

The linear and quadratic effects have the following sums of squares:

$$L = \frac{(71.9 + \cdots + 43.6)^2}{(10)(4)} = 1640$$

$$Q = \frac{(-126.5 - \cdots - 33.2)^2}{(14)(4)} = 1326$$

The remaining sum of squares for “Periods,” 2974 − (1640 + 1326) = 8,
is negligible. This adds credence to the assumption that the interactions
of these effects with treatments are no more than random variation.
The interaction between treatments and linear regression,

$$\frac{(71.9)^2 + \cdots + (43.6)^2}{10} - 1640 = 78,$$

may be partitioned into three comparisons:

$$L \times (F \text{ vs. } W) = \frac{[(71.9 + 80.8) - (59.8 + 43.6)]^2}{(10)(2)(2)} = 60.76$$

$$L \times (C \text{ vs. } P) = \frac{[(71.9 + 59.8) - (80.8 + 43.6)]^2}{(10)(2)(2)} = 1.33$$



Corresponding interactions and components for the other regressions are 
calculated in the same way, only the divisor, 10, changing to 14, 10 and 
then 70 in the successive regressions. 
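
A small Python sketch of these calculations follows, starting from the component differences of Table 2. The function names are illustrative; the "check vs. preservative" comparison is written out on the same pattern as the fresh vs. wilted one.

```python
# Component differences from Table 2, in the order
# fresh check, fresh preservative, wilted check, wilted preservative,
# together with the divisor (sum of squares of the coefficients).
table2 = {
    "linear":    (10, [71.9, 80.8, 59.8, 43.6]),
    "quadratic": (14, [-126.5, -79.2, -33.6, -33.2]),
}


def period_ss(power):
    """Sum of squares for the regression on periods, pooled over 4 treatments."""
    divisor, comps = table2[power]
    return sum(comps) ** 2 / (divisor * 4)


def fresh_vs_wilted_ss(power):
    """Interaction of the regression with the fresh vs. wilted comparison."""
    divisor, (fc, fp, wc, wp) = table2[power]
    return ((fc + fp) - (wc + wp)) ** 2 / (divisor * 2 * 2)


def check_vs_preservative_ss(power):
    """Interaction of the regression with the check vs. preservative comparison."""
    divisor, (fc, fp, wc, wp) = table2[power]
    return ((fc + wc) - (fp + wp)) ** 2 / (divisor * 2 * 2)


for power in table2:
    print(power,
          round(period_ss(power)),
          round(fresh_vs_wilted_ss(power), 2),
          round(check_vs_preservative_ss(power), 2))
```

The linear and quadratic sums of squares and the two fresh vs. wilted interactions computed this way can be compared with the entries 1640, 1326, 61 and 345 of Table 3.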

The pertinent main effects and interactions are copied into Table 3.

TABLE 3

PERTINENT PARTS OF ANALYSIS OF VARIANCE

Source of Variation                                    Degrees of Freedom   Mean Square
Treatments:
  Fresh vs. Wilted                                             1                533
  Other comparisons                                            2                 11
Periods:
  Linear                                                       1               1640
  Quadratic                                                    1               1326
  Higher degree polynomials                                    2                  4
Interactions:
  Fresh-Wilted × Linear                                        1                 61
  Fresh-Wilted × Quadratic                                     1                345
  Others with linear and quadratic polynomials                 4                 24
Error (Interactions with higher degree polynomials)            6                 15.7


The suspected difference between the trends in the fresh and wilted
silage shows up in both the linear and quadratic interactions, the latter
being especially prominent.

We now have evidence that the mean square for “Remainder” in
Table 1 includes population interactions, and is therefore not an unbiased
estimate of error. Of much greater interest is the consequence that one
cannot draw conclusions about the effect of the wilting treatment: the
various effects of this treatment may differ with the period of storage.
At the beginning of the storage period and again at the end the differences
are small and non-significant. It is in the third and fourth periods
that the differences are most pronounced.

George W. Snedecor

QUERY 58: Table 39, page 218, of Fisher's Statistical Methods for
Research Workers, 8th edition, implies that the expected value of
the variance of the subsample means is

(1)  $A + \dfrac{B}{k}$.

When I go through the derivation I find it to be

(2)  $\dfrac{n}{n-1}\,A + \dfrac{B}{k}$.

It seems to me that the thing boils down to the definition of the variance
of the true population means from which the sub-samples have been
drawn. In other words, the difference seems to depend on whether one
takes the sum of squares of the true values and divides by n or by
(n − 1). My interpretation was, of course, that one computed the
variance of the population means by summing the squares of the deviations
and dividing by n. The result given in (1) seems to be obtained by
dividing by (n − 1). This is particularly awkward when n is small, like
2, where the two methods differ by a factor 2. Perhaps you will let me
know what you think about this?


ANSWER: The particular formula obtained as the result of a mathematical
argument is, as you indicate, dependent on the
particular definitions adopted and the mathematical model
set out.

It is clear that the expected value of the variance of the subsample
means will be

$$A + \frac{B}{k},$$

provided that the expected value of the mean square “Between families”
is

$$kA + B.$$

Let the data consist of n families with k observations in each family.
Let $x_{ij}$ represent the jth observation in the ith class. Assume that the
variability of $x_{ij}$ is attributable to two sources: (i) variability affecting
all members of the ith class equally, and (ii) variability peculiar only to
that particular observation.

Assume that

$$x_{ij} = m + f_i + e_{ij},$$

where m = population mean value of $x_{ij}$; $f_i$ = variable changing from
class to class, but constant for all members of a given class; $e_{ij}$ = variable
changing from class to class and also from observation to observation.
Assume that the population mean values of $f_i$ and $e_{ij}$ are zero, and that
$f_i$ and $e_{ij}$ are independent and are samples from general populations whose
variances are A and B respectively.

BIOMETRICS, JUNE 1948

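Summing over the ith class, we obtain

$$T_i = \sum_j x_{ij} = km + kf_i + \sum_j e_{ij}.$$

Then,

$$E\left[\sum_i \frac{T_i^2}{k}\right] = nkm^2 + nkA + nB. \tag{a}$$

Also,

$$\frac{\sum_i T_i}{nk} = \frac{1}{nk}\Bigl(nkm + k\sum_i f_i + \sum_i\sum_j e_{ij}\Bigr),$$

and

$$E\left[\frac{\left(\sum_i T_i\right)^2}{nk}\right] = nkm^2 + kA + B. \tag{b}$$

Now, the sum of squares among families is

$$\sum_i \frac{T_i^2}{k} - \frac{\left(\sum_i T_i\right)^2}{nk},$$

and the expected value of the mean square among families is

$$\frac{1}{n-1}\,E\left[\sum_i \frac{T_i^2}{k} - \frac{\left(\sum_i T_i\right)^2}{nk}\right]. \tag{c}$$

Substituting from (a) and (b) in (c), we have

$$\frac{1}{n-1}\left[nkm^2 + nkA + nB - nkm^2 - kA - B\right] = kA + B.$$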
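
The result can be illustrated numerically. The Python sketch below simulates the model $x_{ij} = m + f_i + e_{ij}$ many times and averages the mean square among families. The normal distributions, the seed, and the particular values of n, k, A, B and m are assumptions made only for this illustration; the derivation above needs only zero means, independence and the stated variances.

```python
import math
import random


def mean_square_between(n, k, A, B, rng, m=10.0):
    """Simulate x_ij = m + f_i + e_ij and return the mean square among families."""
    totals = []
    for _ in range(n):
        f_i = rng.gauss(0.0, math.sqrt(A))  # family effect, variance A
        total = sum(m + f_i + rng.gauss(0.0, math.sqrt(B)) for _ in range(k))
        totals.append(total)                # T_i
    grand = sum(totals)
    ss_among = sum(t * t for t in totals) / k - grand * grand / (n * k)
    return ss_among / (n - 1)


rng = random.Random(1948)
n, k, A, B = 5, 4, 2.0, 3.0  # hypothetical values
reps = 20000
average_ms = sum(mean_square_between(n, k, A, B, rng) for _ in range(reps)) / reps

print(f"simulated mean square among families: {average_ms:.2f}")
print(f"kA + B = {k * A + B:.2f}")
print(f"implied variance of family means: {average_ms / k:.2f}; A + B/k = {A + B / k:.2f}")
```

With A defined as the variance of the population from which the $f_i$ are drawn, the simulated average should settle near kA + B, with no factor n/(n − 1) appearing.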


T. A. Bancroft

HARSHBARGER, BOYD. (Virginia Agricultural Experiment
Station.) Triple Rectangular Lattices.

The paper presented an extension of the Rectangular Lattices which 
was published as Memoir 1 by the Virginia Agricultural Experiment 
Station, to the case where there are three groups. All formulas necessary 
for the Triple Rectangular Lattice are given. 

45. BLISS, C. I. and JACKMAN, M. C. (Connecticut Agricultural
Experiment Station.) Estimation of the Mean and Its Error
from Incomplete Poisson Distributions. Published in Bulletin
No. 513, Connecticut Agricultural Experiment Station, January,
1948.

When organisms or events occur at random in space or time the
number of individuals in each unit follows the Poisson distribution. It
may be necessary or convenient to record in full only the units containing
few observations such as 0, 1, 2 and 3, combining the rest into a single
category. Tables have been computed to facilitate the estimation of the
population mean and its standard error from such incomplete counts.
Agreement of the observed frequencies with those expected by the
Poisson distribution can be tested readily by χ². The calculation of
these statistics is illustrated by haemocytometer counts for measuring
the density of the spores of milky disease in the blood of an infected
Japanese beetle larva.
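
The kind of estimate described here can be sketched by direct maximum likelihood in Python. This is not the tabular method of the bulletin, which is not reproduced in the abstract; the counts, function names and search limits below are hypothetical and serve only to show how a Poisson mean and an approximate standard error might be obtained when all units above 3 are pooled.

```python
import math


def log_likelihood(m, counts, n_rest):
    """Log-likelihood of a Poisson mean m when higher counts are pooled.

    counts[x] is the number of units containing exactly x individuals;
    n_rest is the number of units in the pooled upper category.
    """
    total = 0.0
    p_recorded = 0.0
    for x, n_x in enumerate(counts):
        log_p = x * math.log(m) - m - math.lgamma(x + 1)
        total += n_x * log_p
        p_recorded += math.exp(log_p)
    if n_rest > 0:
        total += n_rest * math.log(max(1.0 - p_recorded, 1e-300))
    return total


def fit_incomplete_poisson(counts, n_rest, lo=1e-6, hi=50.0):
    """Return (estimate of m, approximate standard error) by maximum likelihood."""
    for _ in range(200):  # ternary search; the log-likelihood is concave in m
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if log_likelihood(a, counts, n_rest) < log_likelihood(b, counts, n_rest):
            lo = a
        else:
            hi = b
    m_hat = 0.5 * (lo + hi)
    h = 1e-3 * m_hat  # step for a numerical second derivative
    curvature = (log_likelihood(m_hat + h, counts, n_rest)
                 - 2.0 * log_likelihood(m_hat, counts, n_rest)
                 + log_likelihood(m_hat - h, counts, n_rest)) / (h * h)
    return m_hat, 1.0 / math.sqrt(-curvature)


# Hypothetical counts: units with 0, 1, 2, 3 individuals, then "4 or more".
m_hat, se = fit_incomplete_poisson([112, 168, 130, 68], n_rest=32)
print(f"estimated mean = {m_hat:.3f}, standard error = {se:.3f}")
```

The standard error here comes from the curvature of the log-likelihood at its maximum, which is one common approximation; it is slightly larger than the value for fully recorded counts because pooling discards some information.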

46. BERKSON, JOSEPH. (Division of Biometry and Medical Statistics,
Mayo Clinic, Rochester, Minnesota.) Comparison of
Mean-Cost-Rating and the Biserial Correlation Coefficient.

The mean-cost-rating as previously defined can be estimated conveniently
and with sufficient accuracy for many practical purposes, in
the cases in which the measurement is normally distributed, by plotting
the observed utility against cost on normal-normal paper. A line is fitted
to these points by eye or according to the formula previously given in
terms of the means and standard deviations of the measured variate.
The values of the cost are read off for utilities 0, 0.05, 0.10, 0.15, ...,
0.95, 1.00, and from these the mean-cost is calculated by the trapezoidal
rule.

A comparison was made, for some actual series, of the mean-cost-rating
so obtained and the biserial correlation coefficient. The results
were inconclusive in respect to any generalization regarding the relation
between the two.


47. BERNSTEIN, MARIANNE E. (Syracuse University.) “Use
of Statistical Methods in Human Genetics.” Test for Monomeric
Inheritance Involving Four Alleles. Part of a paper to be
published in “Journal of Heredity.”

Danforth (1921) stated that individuals vary as to the presence,
absence and distribution of hair on the middle segments of the fingers
and showed that complete absence is a recessive trait. This author offers
a monomeric-multiple allele hypothesis to explain the distribution of
hair, calling the alleles (in order of increasing dominance) A_0, A_1, A_2,
and A_3, the subscripts denoting the number of fingers affected.

The hypothesis was tested on sibling pairs using Cotterman's formula,
by which the expected ratio of both siblings dominant to both siblings
recessive is a function of p [expression illegible in source],
where p = 1 − (% recessives in sample)^{1/2}.
Chi-square between observed and expected ratios varied between .047
and .951 with one d.f.

In matings of two dominants the percentage of recessive children
expected is

$$\left(\frac{q}{1+q}\right)^2,$$

where q = 1 − p. Though the proposed monomeric mode of inheritance
involves four gene substitutions, the above formulae developed for two
gene substitutions could be employed by a grouping process as follows:


Phenotype 

Recessive offsprings 



of Matings 

Dominant — Recess 

Obs. 


Nq 3 

Random 

N 


mm 


27.6 

40 

A z or An — Ai or A Q 



20.4 

48 

any hair — no hair 

maSM 


47.4 

114

48. RASHEVSKY, N. (The University of Chicago.) Recent Advances
in Mathematical Biophysics. Symposium on Mathematical
Biology.

A general review of the field of mathematical biophysics is given.
Examples of agreement between theoretical deductions and experimental
data are given, such as cell division, cell respiration, incidence of cancer
with age, reaction times, psychophysical discrimination, discrimination
of intensities and measurement of aesthetic values of visual patterns.

The paper covers essentially the content of the author's article,
“Mathematical Biophysics” (Medical Physics, edited by Otto Glasser,
page 706, Yearbook Publishers, Inc., 1944).


49. BRANSON, HERMAN. (Howard University.) On the Theory
of Metabolizing Systems with Especial Reference to the Use of
Isotopic Tracers. Symposium on Mathematical Biology.

A mathematical treatment of metabolizing systems is outlined which
describes some important characteristics of such systems in terms of a
rate function and a metabolizing function. The resulting integral equations
are applied to several problems of biological and chemical interest.
The equations are solved with functions derived from several sets of
available data. Experimental procedures for determining the functions
are discussed. The integral equations are shown to be valuable in
problems employing isotopic tracers. Evidence is presented to support
the view that this integral equation formulation may be a convenient
means of correlating and integrating some of the work now being done
with tracer molecules in biological systems.

Some of the material discussed in this paper has been published in
The Bulletin of Mathematical Biophysics, 9, 93, 1947.


50. OPATOWSKI, I. (University of Michigan.) Mathematics in
Radiobiology. Symposium on Mathematical Biology.

An attempt is made to interpret the carcinogenic action of radioactive
substances on the basis of a mechanism which is a combination of the
following ideas: the idea of P. Jordan of an extraneous molecule reaching
a particular body molecule in a random fashion to induce a macroscopic
biological event; the idea of an intermediate substance as a more direct
cause of the cancer; and the idea of the growth of the cancer from a
microscopic malignant center. The theory succeeds in describing a part
of the result achieved by A. M. Brues, H. Lisco and M. Finkel through
the induction of bone sarcoma in mice by Sr⁸⁹.


51. LANDAHL, H. D. (The University of Chicago.) Mathematical
Theory of Discrimination and Conditioning. Symposium on
Mathematical Biology.

The mathematical theory of discrimination and conditioning is
discussed and applications of the theory to experimental data are
illustrated by several examples.

The paper covers essentially the contents of Chapters IX and XI
of A. S. Householder and H. D. Landahl's Mathematical Biophysics of
the Central Nervous System (Principia Press, 1945).

52. CULBERTSON, JAMES T. (The University of Chicago.)
Mathematical Theory of Perception. Symposium on Mathematical
Biology.

This paper describes a mechanism for (1) bottle-neck optic nerve
conduction and (2) the recognition of visual spatial forms. (1) In this
mechanism the w retinal receptor neurons are in one-one causal relation
to a set D of w central neurons, the connecting optic nerve containing
less than w fibers. (2) Also a set Φ of μ spatial forms (herein defined)
is in one-one causal relation to a set F of μ central neurons so that if
f_i (any given member of F) fires, then the corresponding φ_i has occurred
in the retinal image. Only φ_i can fire f_i, but f_i may fire for any position,
size or orientation of φ_i on the retina, with the restriction that φ_i not
be smaller than a minimum size, which is a function of retinal position.
Neuron economy is considered throughout.

This paper is published in The Bulletin of Mathematical Biophysics,
10, 31, 1948.

53. RAPOPORT, ANATOL and ALFONSO SHIMBEL. (The University
of Chicago.) Suggested Experimental Procedure for
Determining the Satisfaction Function of Animals. Symposium
on Mathematical Biology.

Some general experimental procedures are suggested to test a previously
developed theory of motivation interactions which gives a quantitative
description of factors in motivation. The experiments are devised
with a view of varying simultaneously the “positive” and “negative”
terms of a postulated satisfaction function which is supposed to depend
on the subject's “effort” and “remuneration.” If the output of effort
on the part of the subject reaches a steady state for a given set of conditions,
this output is taken to be the optimum output under those conditions.
In other words, it is supposed that the satisfaction function is
“maximized.” From the equations describing the derivatives of the
satisfaction function with respect to different variables, the “biological
constants” of the individual can be computed and his behavior predicted
under a variety of other conditions. In particular, on the basis of previous
papers, it should be possible to predict the behavior of two “cooperating”
and “sharing” individuals.

This paper is published in The Bulletin of Mathematical Biophysics,
9, 169, 1947.


54. SMITH, ROBERT E. (Naval Medical Research Institute) and
MANUEL F. MORALES (The University of Chicago.) Theoretical
Studies on Blood-Tissue Exchange of Inert Solutes.
Symposium on Mathematical Biology.

Following a brief review of the kinetics of inert gas uptake by composite
tissue regions as previously developed by the authors, there is
made a comparison between this fairly inclusive formulation and other,
simpler formulations, of which Von Schrotter's is the fundamental prototype.
It is shown analytically that the Von Schrotter formulation gives
a satisfactory asymptotic approximation to the more complete theory
whenever (1) it can be assumed that for each tissue the product of the
area of its exchange surface and the permeability of said surface is much
greater than the blood flow to the tissue, (2) the average gas concentration
in the blood volume is near the venous concentration, and (3) the
tissues are arranged in “distinct parallel.” It is concluded that especially
the first of these approximations cannot be made with any assurance
on the basis of existing experimental data.


55. HEARON, JOHN Z. (The University of Chicago.) The Kinetics
of Blood Coagulation. Symposium on Mathematical Biology.

The overall process of the production of fibrin from fibrinogen is
considered to occur in two distinct, consecutive phases: (a) the formation
of thrombin, T, from prothrombin, P, by the action of “active”
thromboplastin, Th′, and (b) the conversion of fibrinogen, F, to fibrin by the
action of thrombin. The system is examined kinetically on the basis
of the following reactions which accord to Th′, P, T and F the required
roles and include the action of calcium:

$$\begin{aligned}
\mathrm{Th} + \mathrm{Ca} &\rightleftharpoons \mathrm{Th}' \\
\mathrm{Th}' + \mathrm{P} &\rightleftharpoons \mathrm{Th}' \cdot \mathrm{P} \\
\mathrm{Th}' \cdot \mathrm{P} &\rightleftharpoons \mathrm{Th}' + \mathrm{T} \\
\mathrm{T} + \mathrm{F} &\rightleftharpoons \mathrm{T} \cdot \mathrm{F} \\
\mathrm{T} \cdot \mathrm{F} &\rightarrow \mathrm{T} + \text{fibrin}
\end{aligned}$$

The rate of the process is formulated in terms of the concentrations of
the above constituents. The rate expression may be integrated under
the assumption of a steady state, which may be justified on the basis of
the irreversibility of the last step and recent data of Ferry et al. which
indicate that T is “carried down” with the fibrin clot and that the
ultimate restitution of T to the system is slow relative to the primary
clotting process. In this manner there is obtained an equation for the
so-called prothrombin time, when the initial conditions are specified
as being those which obtain for the clinical determination of prothrombin.
The variation of the prothrombin time, t, as a function of P and Th,
predicted from the analysis is compared to experimental data. The
results are discussed with particular reference to the plasma dilution
curves used in the clinical determination of P and the influence of
various factors such as the potency of the Th preparation employed.
Specifically, from the expression for t = f(P, Th, Ca⁺⁺) the plasma
dilution giving a specified uncertainty in t may be determined, and a
linear plot is obtainable. The parameters of the straight line may be
employed as a numerical index to the potency and permit construction of
accurate plasma dilution curves from a minimum number of points.
The availability of the expression for t makes possible the critical
examination of many empirical procedures already in practice. Further,
certain of such procedures may now be carried out analytically.

This paper will be published in The Bulletin of Mathematical
Biophysics, September, 1948.

56. LOTKA, ALFRED J. (Statistical Bureau, Metropolitan Life
Insurance Company.) The Physical Aspect of Organic Evolution.
Symposium on Mathematical Biology.

The system made up of a number of component species of living organisms
and their inorganic environment has evolved and continues to
evolve under a stream of available energy from the sun. The differential
survival of the several components depends on the degree of the success
of each in the competition to secure its share of the available energy from
this stream.

Analytically, the problem of organic evolution presents itself as the
study of the distribution and redistribution of matter, as a function of
time, among specified components of the system of nature.

Physically, the problem is to investigate the relation of this distribution
and redistribution to the physical properties of the components
and their energy environment.

This type of problem is familiar from the study of physicochemical
systems, in which the distribution and change in distribution of matter
among specified components (elements, compounds, phases) is examined
in its relation to parameters of state (volume, pressure, temperature,
etc.). But, whereas it is characteristic of a large part of the domain of
physicochemical dynamics that structure and mechanism play at most a
subordinate role, in the study of organic evolution the structural and
mechanical properties in terms of which the components must be specified,
and on which their aptitude for capturing energy depends, play the
dominant role.

Inasmuch as each component seeks to enlarge its own share of the
available matter and energy, and since they cannot each monopolize the
whole, the question arises, what is the collective result of their competitive
activities? It is in terms of this collective result that we must expect
to find the law of organic evolution expressed.

This paper will be published in The Bulletin of Mathematical
Biophysics, September, 1948.


57  MORALES, MANUEL P. and D. JEAN BOTTS (The University of
Chicago) and TERRELL L. HILL (University of Rochester.) On the
Statistical Mechanics of Antibody-Antigen Combination. Symposium
on Mathematical Biology.

Employing T. Teorell’s model, equilibrium statistical equations are
derived for the reactions between antibody (A) and antigen (G),
$A + A_{i-1}G \rightleftharpoons A_iG$; $i = 1, 2, \ldots, n$, where n is the antigen “valence”.
Four hypotheses are considered: (I) The free energy of bonding of a
single A on a reactive site of G is independent of the state of the remaining
sites of G. (II) There is an energy of interaction, $E_{AA}$, between A’s
bonded to the same G in such a manner that they are nearest neighbors
on the lattice of G sites.1 (Three regular lattices are treated, corresponding
to the contact points on any sphere in the cubic and hexagonal
closest packing of spheres and in the simple cubic packing of spheres.)
(III) The effect of A binding on the translational and rotational properties
of the A-G aggregate is taken into account, assuming that both
A and G are spherical molecules of approximately the same radius.
(IV) The effects treated under (II) and (III) are combined. In each of
the foregoing cases it is shown how the equilibrium constants of the
reaction system may be obtained from experimental concentration
measurements. Numerical calculations are given to show that the
perturbations, (II), (III), and (IV), lead to considerably different results
than the simple treatment corresponding to case (I).

58  SHIMBEL, ALFONSO, and ANATOL RAPOPORT (The University of
Chicago.) A Statistical Approach to the Theory of the Central
Nervous System. Symposium on Mathematical Biology.

A “probabilistic” rather than a “deterministic” approach to the 
theory of neural nets is developed. Neural nets are characterized by
certain parameters which give the probability distributions of different 
kinds of synaptic connections throughout the net. Given a “state” of 
the net (i.e., the distribution of firing neurons) at a given moment, an 
equation for the state at the next moment of quantized time is deduced. 
Certain very special cases involving constant distributions are solved. 
A necessary condition for a steady state is deduced in terms of an integral
equation, in general non-linear.

Published in The Bulletin of Mathematical Biophysics, 10, 41, 1948. 


1 A portion of this work will appear in The Journal of Chemical Physics, May, 1948.




THE BIOMETRIC SOCIETY 


There is a fine old “saw”, no doubt duplicated in the language of 
every country in which the Society has members, to the effect that the 
shoemaker’s child is always without shoes. And, what the Biometric 
Society needs is a statistician. Data, concerning the membership, there 
is in abundance but it doesn’t add up to a nice round number. There¬ 
fore, without benefit of analysis, may we report some “vital” informa¬ 
tion, but, please, no letters to the editor that “the given numbers don’t 
seem to check”, or “the number of charter members varies by a few” at
some later date. The entire world may have been drawn close, even too 
close, but it still takes considerable time to smooth out all details over 
distances of thousands of miles. 

By the last of April there were 528 paid members, 370 of them 
charter. This number does not include 51 registered members still 
involved in exchange difficulties. The grand total of 579 included 166 
representing 24 countries other than Canada, Mexico and the United 
States. Eighty-one of these are members of the British Region, includ¬ 
ing a few from Scotland and Ireland as well as England. Australia is 
represented by twenty-three; France by eighteen; Italy by eleven; 
South America by seven from Argentina, Brazil, Peru and Venezuela; 
India and Switzerland by four each; Holland and Sweden by three each; 
Hawaii and Norway by two each; and China, Czechoslovakia, Denmark, 
Malaya, the Philippines, Portugal, Puerto Rico and Trinidad in the 
British West Indies by one each. 

Any young organization, during its growing pains, is apt to need 
some financial assistance and the Biometric Society is no exception. It 
was soon evident that the duties of the Secretary’s office could not be 
handled without an executive assistant and some office equipment, nor 
could the membership be increased to the point of supporting an organ¬ 
ization without some help in the early stages. Officers of the Rockefeller 
Foundation, which aided in the organization last September at Woods 
Hole, were sympathetic to a proposal for supplementary funds to help 
carry out successfully the objectives of the Society. On March 1, 1948, 
the Foundation very generously granted a fund “not to exceed $7,400,
or as much thereof as may be necessary, to Yale University for the support
of the Biometric Society for the period ending February 28, 1951.”
The University has now received $4,000 to cover expenditures during the
first year under the terms of the grant. Because the tax status of the 
Society cannot be established for a year, Yale University agreed to 
receive the funds for disbursement on order of the Society. 

The grant was made on the basis of providing a part-time assistant 
for the Secretary’s office, the purchase of necessary office furniture and 
supplies, and to cover printing, postage, and publicity for promotion. 
Further, the money is to aid in arranging for and financing international 
conferences as well as regional meetings, and in maintaining a suitable 
journal. Biometrics was selected by the Council as the journal which at 
present suits the needs of the Society. 

Mrs. John H. Watkins, who has been acting informally since last 
November, has been appointed Executive Assistant. Yale University 
has offered office space, rent-free, as soon as it becomes available. Until 
that time the Watkinses are letting us use their home as a temporary office
and have given space for the furniture and equipment. 

Only two regions have been formally established. The British Re¬ 
gion held an inaugural meeting at University College, London, on April 
29, 1948, to adopt a set of rules and make formal arrangements for con¬ 
duct of the Region’s affairs. Professor R. A. Fisher, President of the 
Society and Dr. J. W. Trevan, Vice-President for the British Region, 
addressed the meeting. 

Charles P. Winsor, Vice-President of the Eastern North American 
Region, has appointed the following committees for the year: Program 
committees to cooperate with 

1. The A.A.A.S.: Kenneth S. Cole, D. B. DeLury, M. Demerec,
N. Rashevsky, G. G. Simpson and W. H. Youden, chairman.

2. The A.S.A.: Churchill Eisenhart, H. C. Fryer, Oscar Kempthorne,
P. J. Rulon, W. R. Thompson and H. W. Norton, chairman.

3. The A.P.H.A.: H. L. Dunn, Margaret Merrell, Jane Worcester
and Hugo Muench, chairman.

4. Federation of American Societies of Experimental Biology:
E. J. deBeer, H. K. Hartline, Lloyd Miller and C. I. Bliss, chairman.

A French and a Benelux Region were approved by the Council in 
1947, but neither has as yet completed organization. Adriano Buzzati- 
Traverso, a member of the Council, wrote late in February that “episto¬ 
lary discussion is under way between Professor Georges Teissier”, also 
a Council member, and himself “in order to make definite proposals to 
establish a Region, which should include members of the Society of 
France, Italy, Switzerland, and eventually the Benelux Region.”
Professor Teissier wrote the last of April that he expects to see the
Italians, who are anxious to join a regional group, at Stockholm and at
Paris during the Genetical and Zoological Congresses. He adds that if
the start is a little slow, “les choses ne vont néanmoins pas trop mal.”

Organization is going forward on the Australian Region with a plan 
that one person in each of various cities organize local groups which will 
meet at intervals. These group organizers with M. H. Belz, a member of 
the Society Council, will constitute an Australian Regional Council. 
Thus far the city representatives are E. A. Cornish, officer-in-charge, 
Section Mathematical Statistics, Council for Scientific and Industrial 
Research; and lecturer in mathematical statistics at the University of 
Adelaide, for Adelaide; Rupert Leslie, Section Mathematical Statistics, 
CSIR, attached to Division of Forest Products, for Melbourne; and 
Helen N. Turner, Section Mathematical Statistics, CSIR, attached to 
Division of Animal Health and Production and lecturer in Veterinary 
Biometry, University of Sydney, for Sydney. 

R. A. Fisher, President of the Society, was one of two foreign associ¬ 
ates elected at the annual meeting of the National Academy of Sciences. 

The Secretary’s office has received copies of Ciencia e Investigación,
published in Buenos Aires, Tijdschrift voor Sociale Geneeskunde, The
American Statistician and the Statlab Review of Iowa State College which 
contain accounts of the Society’s organization. Other journals contain¬ 
ing such accounts, or reprints from them, will be appreciated in order to 
keep as complete a publicity file as possible. 



NEWS AND NOTES 

The University of Michigan Summer Session (Ann Arbor) will offer a 
special four weeks session in survey research methods. The program 
will include introductory and advanced courses in survey research and 
sampling methods as well as a course in methods of statistical analysis. 
The survey research course will cover study design, questionnaire con¬ 
struction, interview technique, coding methods and related material. 
The staff will include Rensis Likert, Angus Campbell, Charles Cannell, 
Roe Goodman, George Katona, Daniel Katz, Leslie Kish, Eleanor
Maccoby, and Charles Metzner of the staff of the Survey Research
Center. Morris Hansen, William Hurwitz and Benjamin Tepping of the
Bureau of the Census will offer the advanced sampling courses. Other
special lecturers will participate in the program. All courses will be
given July 19 through August 13, 1948. The introductory course in 
survey methods is being offered June 21 to July 17. All courses are 
offered for graduate credit and students must be admitted by the 
Graduate School. 

CHINA — Wang Chien-ming, College of Agriculture, Sun Yat-Sen 
University, Canton, writes: “During the second world war we have been 
robbed of quite a number of textbooks, reprints, bulletins and other 
periodicals. I think you may take it as a pleasure to cooperate with us 
by sending publications in connection with statistical treatment of 
biological assays, biometry and field technique.”

ENGLAND — K. A. Brownlee, Research and Development Depart¬ 
ment, The Distillers Company, Ltd., Great Burgh, Epsom, Surrey, 
finds that much the greater part of his work is devoted to the field of 
biometrics. The Company runs a number of fermentation processes 
and the usual tests of significance are desirable for analysing process 
data. Along the lines of biological assay, they have greatly increased 
the accuracy of the plate-cup assay for penicillin and streptomycin by 
the use of a doubly confounded layout. For some large scale experi¬ 
ments, particularly on the manufacture of penicillin, they have used 
factorial designs, generally with confounding, and sometimes fractionally 
replicated. . . . D. J. Finney, lectureship in the design and analysis of 
scientific experiment, University of Oxford, has actively contributed to 
the development of Biometrics. He states, “I think the journal is very 
well worthwhile, and the first number of Volume 3, I find particularly
useful. The only suggestion I have to make is that the column of social 
gossip be abandoned.” This is the second time such a suggestion has 
been received during the last three years. Do the rest of our readers 
feel the same about “News and Notes”? ... J. B. S. Haldane, University
College, London, Gower Street, writes, “I am just trying to do some biometry
on Echinocardium, which is about the only solid box other than
the human skull on which large numbers of measurements have been
made. It looks as though one would have to apply the Thurstone type
of analysis to it.” ... L. B. Perrott, 17 Widney Manor Road, Solihull,
Birmingham, a mathematician and statistician at Leicester College of 
Technology, lectures to advanced students. He is interested in mathe¬ 
matical statistics in general, and in the design of experiments in par¬ 
ticular. Mr. Perrott has worked on the incidence of certain diseases in 
various industries. As a statistical consultant, he advises senior research 
scientists (physicists, chemists and biologists) on the design of their 
experiments and assists with the analysis of the results.... E. C. Wood, 
Virol Limited, Hanger Lane, Ealing, is to give a talk on “Statistical 
aspects of chemical analysis” the first of June at the Netherlands Chemical
Society, International Congress in Utrecht (Holland).

FRANCE —R. Fortet, in the faculty of science at the University of 
Caen (Calvados), has joined the Biometric Society. His field of special¬ 
ization is probability theory, more particularly Markoff chains and 
stationary stochastic processes. He writes, “I am interested in your 
Society and its review.” 

NORWAY —From the Agricultural College of Norway, Vollebek, 
Oivind Nissen attended the Plant Breeding Conference January 26-30, 
1948 at Raleigh, North Carolina. He is a plant breeder who is working 
on forage species, primarily clover and timothy. During a second visit 
on March 9, Mr. Nissen gave an illustrated seminar talk on plant breed¬ 
ing in Norway. . . . From the same College comes word from a charter 
member of the Biometric Society, Per Ottestad. In 1931 he was ap¬ 
pointed assistant in the research institute (marine biology) of Professor 
Hjort and was occupied with biological research. He states, “Gradually
I became interested in statistics because I began to understand that this
science was necessary for the development of biological research.” In
1937 he was appointed assistant professor of mathematics at the Agri¬ 
culture College. “Our students are not trained in calculus and, there¬ 
fore, it is not possible to explain to them how the various theorems have 
been deduced mathematically. I try to explain the fundamental logical 
principles of statistics by means of examples and exercises. Experimental 
design is no general subject for teaching in our college. We are now 
working with a revision of the curriculum, and the idea is to introduce a 
course on general scientific methods such as classification, deduction and 
induction, analysis and synthesis, and experimental design.” 
PHILIPPINES —Vincente Mills, Tuguegarao Branch Office, U. S.
Philippine War Damage Commission, Cagayan, sends this encouraging
message, “I believe the International Biometric Society is the organization
which can effect most successfully the systematic and progressive
development of biometry.” The U. S. Philippine War Damage Com¬ 
mission is rehabilitating the country from the ravages of war, for which 
purpose the Congress of the U. S. has authorized the appropriation of 
five hundred twenty million dollars for compensation of private and 
public claims. The Commission is under the leadership of Frank A. 
Waring, Chairman, and the Commissioners Francisco A. Delgado and 
John A. O’Donnel. Mr. Mills hopes that by the time their task at 
economic reconstruction shall have been completed, he will have suffi¬ 
cient qualitative facts and quantitative data to be of value in subsequent 
economic and econometric studies. As Assistant Census Commissioner 
for the 1939 Census of the Philippines, Mr. Mills became interested in 
the problems on population, mortality, morbidity, and other biostatis- 
tical data. 

VENEZUELA —Eric Michalup, Actuary, Caracas, writes that he 
read with great interest the short paper by Margaret Merrell, Volume 3: 
129-136, 1947 of Biometrics. We appreciate your suggestions which 
have been followed. It would be most helpful if more of our readers 
would take time to say what you want. Our efforts to secure articles 
being requested have not been very successful, but we can keep on trying. 

UNITED STATES —Robert A. Harte, Chief Research Chemist, The 
Arlington Chemical Company, Yonkers, New York, is particularly 
interested in allergy. He writes, “There are many problems in allergy 
which call for statistical handling and we have quite a list of projects 
which we should like to undertake. An extremely interesting application
of statistical methodology to allergy has been initiated in two papers by
T. G. Andrews (Journal of Allergy 14:322, 1943; 19:43, 1948) in which
factorial analysis has been applied to the responses of a relatively large 
group of individuals to skin reactions.” . . . Jerome C. R. Li, Assistant 
Professor of Mathematics, Oregon State College, Corvallis, is doing 
teaching and consultant work in statistics. He has started a sequence of 
two statistics courses for the agricultural students. Computing facilities 
have been provided for student use. He writes, “The Climate is excel¬ 
lent. Definitely there is the possibility of further expansion in our 
statistics program.” ... H. M. C. Luykx, New York University, College 
of Medicine, is interested in the application of statistical methods in 
medicine and public health. More articles demonstrating the value of 
statistical tools are welcome. ... Sophie Marcuse, formerly statistician
with the Bureau of Human Nutrition and Home Economics is now with
the Naval Research Laboratory. She is still in Washington and writes, 
“The material is different but again it is design that is important in 
experimentation.” ... M. B. Mittleman, Management Counsel of M. B.
Mittleman Associates, New Rochelle, New York, calls himself an avocational
biometrician. He writes, “For the past 12 years I've been an
avocational herpetologist (majored in Zoology at Ohio University). For 
perhaps the first half of these years, I was vaguely unhappy about the 
crudity of the taxonomic efforts of myself and others. Having pulled 
about the same number of taxonomic boners as my colleagues, I felt 
that herpetology should not be any less amenable to refinement than the 
other classificatory sciences. Laurence M. Klauber’s papers provide a 
tour de force of applied biometrics, and I have found them stimulating 
and provocative.” Mr. Mittleman reports that his vocational applica¬ 
tion of statistics is in such fields as marketing forecasts and performance 
analysis. ... Max Shiffman, a member of the faculty of the department
of graduate mathematics at New York University, will join the Stanford 
faculty in September as Professor of Mathematics. During the war he 
was a research mathematician with the Office of Scientific Research and 
Development. . . . George W. Snedecor, President of the American 
Statistical Association and visiting Research Professor of Statistics at 
Alabama Polytechnic Institute, addressed a joint session of the following 
sections: Biology and Medical Science, Industry and Economics, and the 
Social Sciences, on the subject, “Increasing the Efficiency of Sampling 
Investigations,” at a recent meeting of the Alabama Academy of
Science. ... Francis Joseph Weiss, Special Research Consultant, Sugar 
Research Foundation Inc. states, “I am looking forward to having, 
through your Biometric Society, closer contact with biologists and bio¬ 
chemists who like me are interested in the mathematical presentation 
of biological phenomena.” . . . H. G. Wilm formerly with the Rocky 
Mountain Forest and Range Experiment Station at Fort Collins, Colo¬ 
rado, is now with the Southern Forest Experiment Station with head¬ 
quarters at New Orleans, Louisiana. He will conduct flood-control 
survey work in the Forest Service’s Southern Region and adjacent terri¬ 
tory, covering an area which extends in general from east Texas to the 
south Atlantic Coast. A part of his job will be adapting and applying 
sampling techniques and other statistical methods to flood-control 
surveys. 

THE ANALYSIS OF COVARIANCE*

D. B. DeLury 

Virginia Polytechnic Institute 
and the Ontario Research Foundation 


SECTION I 

The whole of this discussion is based on the data of a single
experiment, the details of which have been published under the 
title “The Effect of Atropine and Quinidine Sulphate on Atrophy and 
Fibrillation in Denervated Skeletal Muscle.” [1] 

For the present, we may adopt the view that the experiment was 
conducted to compare the effects of four different drugs in delaying the 
atrophy of denervated muscles. The structure of the experiment was, 
briefly, as follows. A number of rats were put randomly into four groups; 
a certain muscle in the hind leg of each rat was deprived of its nerve 
supply by severing the appropriate nerves, the leg to be denervated 
(right or left) being chosen at random. Each of the four groups was 
assigned to treatment by one of the drugs and the treatments were 
continued throughout the course of the experiment. Four days after 
treatment was begun, four rats were chosen randomly from each of the 
four groups and measures of atrophy were obtained from them. This 
procedure was repeated after 8 and 12 days. 

Atrophy is usually measured by the loss of weight of the muscle after 
denervation. The weight of the muscle can be determined only by
killing the animal and removing and weighing the muscle; hence the 
weight of the muscle at the beginning of the experiment is not known and 
a direct measure of weight loss is not obtainable. The device ordinarily 
used to circumvent this difficulty is to obtain the weight of the same 
muscle from the other leg (which was not denervated), at the same time 
as the denervated muscle is taken, and to assume that the weight of the 
intact muscle, at the end of the experiment, is the same as that of the 
denervated muscle at the beginning of the experiment. 


*A revision of an expository paper given at a joint meeting of the Biometrics Section and the
Institute of Mathematical Statistics, held in conjunction with the 113th Annual Meeting of the American
Association for the Advancement of Science, Boston, Massachusetts, December 28, 1946. Section
I contains the substance of the paper as presented. Section II is based on comments given by Dr. C. I.
Bliss in a prepared discussion. Section III was added later.
TABLE I

                                   body weight        muscle weight
Days   Drug                      initial   final   denervated  intact   fibrillation
                                    x        y         u         v           w

 4     A (large atropine)          217      196       0.94       …           …
                                   246      218       1.16       …           …
                                   256      216       1.26       …           …
                                   200      165       0.85      1.01         …

 4     B (moderate quinidine)      198      202       1.19       …          8.5
                                   248      231       1.15       …          7
                                   180      187       0.86       …         17.5
                                   218      230       1.21       …         16.5

 4     C (moderate atropine)        …       231       1.22      1.34       10
                                    …       170       0.90      1.00       12
                                    …       189       1.00      1.03        7
                                    …       185       1.00      1.14       14.5

 4     D (saline)                  181      193       0.99      1.17       12
                                   266      285       1.51      1.73       14
                                   274      266       1.55      1.75       17.5
                                   180      188       0.98      1.15       12.5

 8     A                           265      183       0.91      0.91        5
                                   248      190       0.73      0.89        7
                                   238      166       0.52      0.77       14
                                   180      169       0.65      0.97       13

 8     B                           186      200        …        1.24        6.5
                                   220      221        …        1.42        9
                                   199      230        …        1.40       11
                                   240      246        …        1.38        9

 8     C                           178      162        …         …          4
                                   188      181        …         …          9
                                   250      235        …         …          9
                                   195      182       0.75       …         10

 8     D                           194       …        0.97       …          6.5
                                   274      287       1.07       …         14
                                   222      237       1.16       …         15
                                   274      243       1.04      1.69       13

12     A                           198       …        0.34       …          …
                                   175      150       0.43       …          5
                                   199      159       0.41       …         15
                                   224       …        0.48       …         19

12     B                           283      242       0.41      1.08        …
                                   250      226       0.87      1.30        3
                                   289      300       0.91      1.67       15
                                   255      252       0.87      1.52       11

12     C                           204       …        0.57      0.97       16
                                   234       …        0.80      1.10       13
                                   211       …        0.69      0.87       22
                                   214      200       0.84      1.22       15

12     D                           186      243       0.81       …          4
                                   286      297       1.01       …          4
                                   245      264       0.97       …          7
                                    …       228       0.87       …          …

We might expect that the difference between these two numbers
(intact minus denervated) would be used in whatever analysis is undertaken
to compare the effects of the four drugs. However, it seems to be
generally accepted that the ratio, (denervated)/(intact), is the proper
combination of these two numbers to use.

Table I contains the measurements yielded by the experiment. For 
the moment, we are concerned only with the weights of the denervated 
and intact muscles. The ratios (total denervated)/(total intact), and 
the differences, for the 12 sets of 4 rats, are listed in Table II. 


TABLE II

                     Ratios                         Differences
              A      B      C      D          A      B      C      D
4 days        …     .88     …     .87        .47    .57     …     .77
8 days       .79    .69    .76    .65        .63    .69    1.03   1.24
12 days      .65    .55    .70    .55        .90    2.46   1.26   3.03



Putting aside the question of a test of significance, it would appear 
that responses to the drugs are of two kinds; A and C are alike, B and D 
are alike, but there is a perceptible difference between the two pairs. 
This decision is reached on the basis of either the proportions or the 
differences. 

Now drug D was, in fact, simply a saline solution and could have had 
no effect on atrophy; drug B was a moderate dose of quinidine sulphate, 
which might have acted to delay the deterioration of the denervated 
muscles, but apparently did not to any considerable degree; drugs C and 
A were moderate and large doses of atropine sulphate which, we might 
conclude, did exert some beneficial influence in retarding the progress 
of atrophy. 

Altogether this makes a not unreasonable picture. However, let us
inquire more closely into the numbers which have gone into this simple
analysis. Why should the ratio (denervated)/(intact) be used? Is
there any reason why we should not conduct the analysis on the weights
of the denervated muscles alone?

The answer to the second of these questions is obvious: an analysis
based on the weights of the denervated muscles would be legitimate, but
would probably be very insensitive, because differences among the initial
weights of these muscles would enter into the analysis as experimental
error and might well obscure real effects. Furthermore, we must conclude
that the weight of the intact muscle is introduced in the ratio
(denervated)/(intact) in the hope of preventing these differences among
initial muscle weights from inflating the experimental error. It should
be clear, however, that this device is very unlikely to succeed, even if the 
intact muscle weight were identical with the initial weight of the denervated 
muscle, because to suppose that the ratio (denervated)/(initial) is 
independent of the initial weight is equivalent to assuming that the 
amount of atrophy is proportional to the initial weight. The constant 
of proportionality could, of course, change from one treatment to 
another, but in any case this assumption is one that rarely is met in 
practice. Likewise, if we were to base an analysis on the difference, 
(initial) — (denervated), in the expectation of removing the effects of 
variation among the initial weights, we would be successful only if the 
amount of atrophy were independent of the initial weight. This assumption
is not appreciably better than that of proportionality.
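
The two assumptions just described can be written out in symbols. This is only a sketch: $m_0$ (the unobservable initial weight of the denervated muscle), $a$ (the amount of atrophy) and the constants $k$ and $c$ are notation introduced here for illustration, while $u$ is the final denervated weight as in Table I.

```latex
\begin{aligned}
u &= m_0 - a, \\
\frac{u}{m_0} &= 1 - \frac{a}{m_0}
  \quad\text{is free of } m_0 \text{ only if } a = k\,m_0,
  \text{ i.e. } u = (1-k)\,m_0 \text{ (proportionality)}, \\
m_0 - u &= a
  \quad\text{is free of } m_0 \text{ only if } a = c,
  \text{ i.e. } u = m_0 - c \text{ (additivity)}.
\end{aligned}
```

Under proportionality the final weight is a straight line through the origin in $m_0$; under additivity it is a straight line of unit slope. Neither the intercept nor the slope is likely to be known in advance.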

When, to these weaknesses, is added the obvious fact that the final
weight of the intact muscle is certain to differ from the initial weight of 
the denervated muscle, the use of the intact muscle weight, in either a 
proportion or a difference, is seen to be open to serious objection. 

Of the many possible reasons why the final intact muscle weight is 
likely to differ from the initial weight of the denervated muscle, two
deserve special mention. 

(1) Throughout the course of the experiment, the animal may grow 
normally, or nearly so, apart from the denervated muscle. The 
intact muscle is therefore heavier at the end of the experiment 
than it was at the beginning. This does not, in itself, render the 
intact muscle undesirable for use in the analysis. Indeed, the 
effect may be entirely good, because the intact muscle weight 
would reflect the weight that the denervated muscle would have
attained if the nerve had not been cut.

(2) The drugs employed may affect directly the weight of the intact 
muscle. If this is the case, the consequences of using this number 
to indicate the initial weight of the denervated muscle are 
devastating. It was in anticipation of this possibility that records
of the initial and final weights of the animals were kept in
this experiment.

Only a glance at Table I is needed to see that the drugs did affect the 
weights of the intact muscles. The totals of the weights of both intact 
and denervated muscles are shown in Table III. (From this point on,
the decimals are dropped from the observations on u and v.) 

Recalling that, on the basis of either ratios or differences, it was 
decided that treatments A and C might be producing beneficial effects 
and that B and D were not, it is somewhat startling to observe that the 
weights of the denervated muscles are lower in the A and C groups than 
in the B and D groups. It is seen too that the same thing is true of the 


TABLE III

                  A              B              C              D
             den.   int.    den.   int.    den.   int.    den.   int.
              u      v       u      v       u      v       u      v
4 days       421    468     441    500     412    451     503    580
8 days       281    354     375    544     322    425     424    650
12 days      166    256     306    552     290    416     366    669


intact muscle weights. Now we can see clearly what has happened. 
The ratios computed for the A and C drugs were high, not because their 
numerators were large (the reverse was the case) but because their 
denominators were small. Similar remarks apply to the differences. We 
see also that the conclusions reached by means of ratios and differences 
are exactly the reverse of those which ought to be reached (assuming for 
the moment that some conclusion is warranted). 
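
The reversal described above can be checked directly from the group totals of Table III. The following is a minimal sketch, not part of the original analysis: the function and variable names are made up for illustration, and the totals are taken to be in hundredths of the unit used in Table I (decimals dropped).

```python
# Group totals from Table III, keyed by days and then by drug.
# Each entry is (denervated total u, intact total v), decimals dropped.
TABLE_III = {
    4:  {"A": (421, 468), "B": (441, 500), "C": (412, 451), "D": (503, 580)},
    8:  {"A": (281, 354), "B": (375, 544), "C": (322, 425), "D": (424, 650)},
    12: {"A": (166, 256), "B": (306, 552), "C": (290, 416), "D": (366, 669)},
}


def ratio_and_difference(u_total, v_total):
    """Return (denervated/intact, intact - denervated in the original unit)."""
    return u_total / v_total, (v_total - u_total) / 100.0


def summarize(table):
    """Compute the ratio and the difference for every days-by-drug group."""
    return {
        days: {drug: ratio_and_difference(u, v) for drug, (u, v) in row.items()}
        for days, row in table.items()
    }


if __name__ == "__main__":
    for days, row in summarize(TABLE_III).items():
        for drug, (ratio, diff) in row.items():
            u, v = TABLE_III[days][drug]
            print(f"{days:>2} days  {drug}: u={u} v={v} "
                  f"ratio={ratio:.2f} difference={diff:.2f}")
```

At 12 days, for example, group A has a higher ratio than group D even though its denervated total is less than half of D's; the ratio is high only because the intact total for A is so small.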

It should be clear at this point that if we are to make allowance in our 
analysis for the variation among the initial weights of the denervated 
muscles, we must have either their values or measurements on some 
variable which follows their values as closely as possible and which is not 
subject to the influence of the drugs. The initial weights we cannot 
obtain, but it seems reasonable to suppose that these initial weights are 
closely related to the total weights of the animals at the beginning of the 
experiment. This supposition should, of course, be checked against 
experimental evidence. The technique employed to perform this check 
and to make proper allowance in the analysis for variation among the 
initial weights is known as The Analysis of Covariance. 

The analysis of covariance is essentially an application of standard 
regression theory to the problem at hand. It seems obvious that the 
dependence of the final weight of the denervated muscle on its initial 
weight, or on some variable which is correlated with it, is likely to lie 
somewhere between the extremes of proportionality and additivity which
were examined earlier. Clearly it is necessary that we assess this
dependence from the observations of our experiment, and this is precisely
the job for which the theory of regression was developed. This situation 
differs from the general regression problem only through being somewhat 
simpler. The simplicity derives from the fact that the observations to 
which regression methods are to be applied come from a balanced 
experiment. 

The computations required are exhibited in Tables IV and V. An
ordinary analysis of variance is performed on the variables x (initial total
body weight) and u (final denervated muscle weight). The results of
these computations are entered in the first four rows of Table IV, in the
columns labelled (x^2) and (u^2). The column headed (xu) gives the
breakdown of the sum of the products of x and u, according to the same rules
as are used in separating the sums of squares of x and of u.


                              TABLE IV

                    SUMS OF SQUARES AND PRODUCTS

 row                            d.f.     (x^2)     (xu)     (u^2)

  1   times                       2        264    -1736     13269
  2   drugs                       3       2969     3451      7931
  3   times X drugs               6       9053     2083      1529
  4   error                      36      38244    14562     10111
  5   times + error              38      38508    12826     23380
  6   drugs + error              39      41213    18013     18042
  7   times X drugs + error      42      47297    16645     11640


                              TABLE V

   SUMS OF SQUARES FOR REGRESSION AND DEVIATIONS FROM REGRESSION

 row                            regression        deviations
                                d.f.   s.s.     d.f.    s.s.     m.s.

  1   error                       1    5545      35     4566      130
  2   times + error               1    4272      37    19108
  3   drugs + error               1    7873      38    10169
  4   times X drugs + error       1    5858      41     5782
  5   times                                       2    14542     7271
  6   drugs                                       3     5603     1868
  7   times X drugs                               6     1216      203


Only the simplest arithmetic is involved in passing from Table IV to
Table V. The s.s. in the error row of Table V are obtained by combining
the numbers in the error row of Table IV; the s.s. for regression is
(14562)^2/(38244) and the s.s. for deviations from regression is 10111 -
(14562)^2/(38244). The rows for (times + error), (drugs + error) and
(times X drugs + error) in Table V are computed according to the
same rule from the corresponding rows of Table IV. The entries in
rows 5, 6 and 7 of Table V are obtained by subtracting the deviations
s.s. in row 1 from those in rows 2, 3 and 4.
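
The passage from Table IV to Table V can be sketched in a few lines. This is a minimal illustration, not the original computing routine; the function and variable names are hypothetical, and the inputs are the first four rows of Table IV.

```python
# First four rows of Table IV: (d.f., (x^2), (xu), (u^2)).
# The drugs (xu) entry is 3451 and the times X drugs (x^2) entry is 9053,
# the values that agree with the pooled rows (18013 - 14562, 47297 - 38244).
TABLE_IV = {
    "times": (2, 264, -1736, 13269),
    "drugs": (3, 2969, 3451, 7931),
    "times x drugs": (6, 9053, 2083, 1529),
    "error": (36, 38244, 14562, 10111),
}


def regression_split(sxx, sxu, suu):
    """Split (u^2) into regression s.s. and deviations from regression."""
    regression = sxu ** 2 / sxx
    return regression, suu - regression


def adjusted_analysis(table):
    """Return {row: (d.f., adjusted s.s., m.s.)} as in rows 1, 5, 6, 7 of Table V."""
    df_e, sxx_e, sxu_e, suu_e = table["error"]
    _, dev_error = regression_split(sxx_e, sxu_e, suu_e)
    dev_df = df_e - 1  # one d.f. is used by the fitted slope
    result = {"error": (dev_df, dev_error, dev_error / dev_df)}
    for name, (df, sxx, sxu, suu) in table.items():
        if name == "error":
            continue
        # Pool the row with error, fit the regression, then remove error.
        _, dev_pooled = regression_split(sxx + sxx_e, sxu + sxu_e, suu + suu_e)
        adjusted = dev_pooled - dev_error
        result[name] = (df, adjusted, adjusted / df)
    return result


table_v = adjusted_analysis(TABLE_IV)
for name, (df, ss, ms) in table_v.items():
    print(f"{name:<15} d.f.={df:>2}  s.s.={ss:8.0f}  m.s.={ms:7.0f}")
```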

Underlying this calculation there is envisaged a regression line,
$U - \bar{u} = b(x - \bar{x})$, which expresses the dependence of u on x. This line
is fitted using the sums of squares and products from the error row of
Table IV, in the expectation that any dependence detected here will not
be reflecting time or drug effects. The slope b of the line is therefore
(14562)/(38244) = .3808.

This line can be used to compare two observations or two averages
of u, undisturbed by differences between their corresponding x-values,
provided the line expresses adequately the dependence of u on x. If
$(u_1, x_1)$, $(u_2, x_2)$ are two pairs of observations or averages, it is seen that
$u_1$ and $u_2$ may be expected to differ by $b(x_1 - x_2)$, by reason of the
difference between their x-values, and allowance should be made for this fact
in trying to interpret the difference $(u_1 - u_2)$. In practice, instead of
evaluating each pair separately, it is simpler to reduce each (u, x) pair
to an equivalent pair $(u', \bar{x})$, where $\bar{x}$ is the grand mean of x and
$u' = u - b(x - \bar{x})$. Differences
among the u'-values will then be unaffected by differences among the x's.

Totals of x and u, taken from Table I, and the adjusted values u' are 
given in Table VI. 


                              TABLE VI

   DENERVATED MUSCLE WEIGHTS, ADJUSTED FOR INITIAL BODY WEIGHT

                   A                 B                 C                 D
              x    u    u'      x    u    u'      x    u    u'      x    u    u'

 4 days     919  421  411     844  441  469     866  412  422     901  503  499
 8 days     931  281  266     845  375  393     811  322  353     964  424  396
12 days     796  166  202    1027  306  254     863  290  301     932  366  351
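
The adjustment behind Table VI can be sketched as follows, assuming that the adjusted total is u' = u - b(x - x̄) with x̄ taken as the mean of the twelve cell totals of x. The dictionary and function names are illustrative only.

```python
# Error-row slope from Table IV.
SLOPE = 14562 / 38244

# Cell totals (x, u) from Table VI, keyed by (drug, days).
# D at 12 days uses u = 366, the total shown in Table III.
CELLS = {
    ("A", 4): (919, 421), ("B", 4): (844, 441),
    ("C", 4): (866, 412), ("D", 4): (901, 503),
    ("A", 8): (931, 281), ("B", 8): (845, 375),
    ("C", 8): (811, 322), ("D", 8): (964, 424),
    ("A", 12): (796, 166), ("B", 12): (1027, 306),
    ("C", 12): (863, 290), ("D", 12): (932, 366),
}


def adjust_for_covariate(cells, slope):
    """Move every u to the value expected at the grand mean of x."""
    x_bar = sum(x for x, _ in cells.values()) / len(cells)
    return {key: u - slope * (x - x_bar) for key, (x, u) in cells.items()}


adjusted = adjust_for_covariate(CELLS, SLOPE)
for (drug, days), value in sorted(adjusted.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"{drug} {days:>2} days  u' = {value:6.1f}")
```

Because the adjustment is linear, it can be applied to cell totals or to cell means without changing the comparison among treatments.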


The manner in which the adjusted values u' vary from treatment to
treatment points to substantial differences among the responses to the
four drugs. We may well feel that a test of significance for these
differences is required here. The numbers in Table V were calculated to
provide such tests. The mean squares in the last column of Table V
may properly be compared in an F-ratio. Thus F = (1868)/(130) =
14.4, with 3 and 35 d.f., tests the drug mean square against that for
error. The mean squares for time and time X drug may be tested in the
same way.

A test of significance for b seems to be not particularly necessary. 
After all, the variable x was selected with considerable a priori assurance 
that u would be closely related to it. However, such a test, if desired, is 
readily obtained from numbers already calculated. In the error row of 
Table V, the 1 d.f. for regression exhibits the amount by which the
“total” error s.s. is diminished by introducing the regression of u on x; 
a comparison of this with the residual s.s. indicates the effectiveness of 
the regression in reducing variation. F = (5545)/(130) = 42.6, with 1 
and 35 d.f., shows that b is highly significant. 
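
As a small worked illustration, the F-ratios can be formed from the rounded mean squares of Table V. The names below are hypothetical, and because rounded entries are used the last digit may differ slightly from figures worked with unrounded values.

```python
# Mean squares from Table V (deviations from regression), rounded as printed.
ERROR_MS = 130
ERROR_DF = 35
EFFECT_MS = {"times": (2, 7271), "drugs": (3, 1868), "times x drugs": (6, 203)}

# Regression s.s. in the error row (1 d.f.), used to test the slope b.
REGRESSION_SS = 5545


def f_ratio(effect_ms, error_ms):
    """Variance ratio of an effect mean square to the error mean square."""
    return effect_ms / error_ms


f_values = {name: f_ratio(ms, ERROR_MS) for name, (_, ms) in EFFECT_MS.items()}
f_slope = f_ratio(REGRESSION_SS / 1, ERROR_MS)

for name, (df, _) in EFFECT_MS.items():
    print(f"{name:<15} F = {f_values[name]:7.2f} with {df} and {ERROR_DF} d.f.")
print(f"{'slope b':<15} F = {f_slope:7.2f} with 1 and {ERROR_DF} d.f.")
```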

The analysis thus far shows clearly that the drugs produced important
effects on the denervated muscles and that these effects were all
harmful, in the sense that they hastened loss of weight. Reference to
Table I shows that this loss of weight was, to some extent, general,
because total body weight and the intact muscle weight both decreased
under some of the treatments. It is reasonable to ask, then, if the drugs 
had any specific effect on the denervated muscle, because it may be that 
the excessive weight loss of the denervated muscle was simply the result 
of the decrease in total body weight. 

This question may be explored by following a little further the approach
which has already been employed, that is, by fitting a regression
equation to express the dependence of final denervated muscle weight on 
both initial and final total body weights and examining the residual 
variation. Final total body weight (y) was recorded in the experiment 
with this purpose in view. 

The regression equation may be written

$$U - \bar{u} = b_1 (x - \bar{x}) + b_2 (y - \bar{y}).$$

The coefficients $b_1$ and $b_2$ are determined by the normal equations

$$(x^2) b_1 + (xy) b_2 = (xu)$$

$$(xy) b_1 + (y^2) b_2 = (yu)$$

in which the bracketed symbols denote sums of squares and products of
deviations from means. The sum of squares of residuals is given by
$(u^2) - b_1 (xu) - b_2 (yu)$, which becomes, on substituting for $b_1$ and $b_2$,

$$(u^2) - \frac{(y^2)(xu)^2 - 2(xy)(xu)(yu) + (x^2)(yu)^2}{(x^2)(y^2) - (xy)^2}.$$

All this is standard regression theory. This formula may be used to
partition the total s.s. (u^2) into two parts, one attributable to regression
and the other representing deviations from regression. The arithmetic is
carried out exactly according to the pattern of Tables IV and V. The
results are given in Tables VII and VIII.
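
A minimal sketch of this two-covariate computation follows, applied to the error row of Table VII. It solves the normal equations directly by Cramer's rule; the function name and argument names are hypothetical.

```python
def two_covariate_split(sxx, sxy, syy, sxu, syu, suu):
    """Solve the normal equations for b1, b2 and split (u^2).

    Arguments are sums of squares and products of deviations:
    (x^2), (xy), (y^2), (xu), (yu), (u^2).
    Returns (b1, b2, regression s.s., residual s.s.).
    """
    det = sxx * syy - sxy ** 2
    b1 = (syy * sxu - sxy * syu) / det
    b2 = (sxx * syu - sxy * sxu) / det
    regression_ss = b1 * sxu + b2 * syu
    return b1, b2, regression_ss, suu - regression_ss


# Error row of Table VII.
b1, b2, reg_ss, resid_ss = two_covariate_split(
    sxx=38244, sxy=26612, syy=25108, sxu=14562, syu=11972, suu=10111
)
resid_df = 36 - 2  # two d.f. are used by the two fitted coefficients
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}")
print(f"regression s.s. = {reg_ss:.0f}, residual s.s. = {resid_ss:.0f}")
print(f"residual m.s. = {resid_ss / resid_df:.0f}")
```

The same function applied to the pooled rows (times + error, and so on) gives the remaining entries, following the pattern used for a single covariate.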


                              TABLE VII

                    SUMS OF SQUARES AND PRODUCTS

                          d.f.   (x^2)    (xy)   (y^2)    (xu)    (yu)   (u^2)

times                       2      264     287     414   -1736   -1450   13269
drugs                       3     2969    8543   35228    3451   15782    7931
times X drugs               6     9053    6671    8087    2083    2457    1529
error                      36    38244   26612   25108   14562   11972   10111
times + error              38    38508   26899   25522   12826   10522   23380
drugs + error              39    41213   35155   60336   18013   27754   18042
times X drugs + error      42    47297   33283   33195   16645   14429   11640


                              TABLE VIII

   SUMS OF SQUARES FOR REGRESSION AND DEVIATIONS FROM REGRESSION

 row                            regression        deviations
                                d.f.   s.s.     d.f.    s.s.     m.s.

  1   error                       2    6058      34     4053      119
  2   times + error               2    4635      36    18745
  3   drugs + error               2   12930      37     5112
  4   times X drugs + error       2    6612      40     5028
  5   times                                       2    14692     7346
  6   drugs                                       3     1059      353
  7   times X drugs                               6      975      162

162 


It is interesting to observe the effect of introducing the final total
weight into the analysis. The error mean square has not been affected
appreciably (130 to 119), but the drugs mean square has been reduced
from 1868 to 353. The ratio F = (353)/(119) = 2.97, with 3 and 34 d.f.,
is just below the 5% point, but whether or not this is judged to be
significant, it is clear that by far the greater part of the differences among
drugs is accounted for by their effects on the total weights of the animals.
Values of denervated muscle weight, adjusted to equalize both initial
and final total body weights, are given in Table IX.

                              TABLE IX

 DENERVATED MUSCLE WEIGHTS, ADJUSTED FOR INITIAL AND FINAL TOTAL
                            BODY WEIGHTS

                 A       B       C       D

 4 days         429     448     435     526
 8 days         311     368     360     586
12 days         241     231     323     557


We have, in the foregoing analysis, examples of the two chief uses of
the analysis of covariance. The initial total weights were brought into
the analysis to provide control over a source of disturbance which did
not lend itself to experimental control. It is characteristic of variables
which are used for this purpose that they are independent of treatment
effects. Here the interpretation of the analysis is clear-cut. We are not,
as a rule, concerned with testing the dependence of one variable on
another. Rather, we require a priori confidence that the variable
selected does furnish the desired control.

The final total weights were introduced for a wholly different
purpose, not to establish control over a source of variation, but to aid in
understanding the results of the experiment. Here, the earlier analysis
established the fact that the drugs did produce substantial effects on the
weights of the denervated muscles. The later analysis was undertaken
to find out the manner in which the effects were brought about. The
variable introduced for this purpose, final total weight, is affected by the
treatments. This is more or less typical in this second use of covariance
analysis. In cases of this kind, considerable caution must be used, both
in deciding whether or not to introduce the variable and in the
interpretation of the results.

We have not, in this second case, calculated a significance test either
for $b_2$ or for the combined effects of $b_1$ and $b_2$. These tests are easily
computed, but there is no reason to do so here, since none of our
conclusions depends on tests of these quantities. This need not always be
the case. Indeed, in some instances, the primary reason for conducting
an analysis of covariance is to perform a test of significance. A case of
this kind occurs in the experiment we have been discussing.

Shortly after denervation, a muscle exhibits a random twitching 
called fibrillation, which persists with gradually decreasing intensity 
until atrophy is complete. It has been conjectured that fibrillation
causes atrophy and therefore, that a treatment which diminishes the 
intensity of this twitching should delay the progress of atrophy. The 
drugs used in this study were chosen because of their known effects on 
fibrillation. Measurements of the intensity of fibrillation (w) were made 
by electrical methods. The records are given in Table I. 

If a lowering of the intensity of fibrillation does in fact decrease the
rate of atrophy, we should find a negative correlation between the
variables u and w. It is clear, however, that a correlation coefficient
calculated in the ordinary way from the 48 pairs of observations on u and w
may be misleading, because each of the variables in the correlation is
affected both by the drugs and by the elapsed time. These sources of
disturbance may be avoided by using the sums of squares and products
from the error row to calculate the correlation coefficient. (Of course, a
regression coefficient could be used here instead of a correlation
coefficient.) This computation, then, will be based on Table X.


                              TABLE X

                    SUMS OF SQUARES AND PRODUCTS

                    d.f.     (u^2)      (uw)      (w^2)

times                 2      13269     801.8      67.28
drugs                 3       7931    -176.6      16.81
times X drugs         6       1529     182.2     310.55
error                36      10111    -170.1     520.19

The coefficient of correlation calculated from the error row is

$$r = \frac{-170.1}{[(10111)(520.19)]^{1/2}} = -.07$$

which may be tested for significant departure from zero by entering a
table prepared for this purpose with n = 35 (e.g. R. A. Fisher, Statistical
Methods for Research Workers, Table V.A.). This value of r is much
too small to be judged significant.
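
The error-row correlation is a one-line computation; the sketch below, with hypothetical names, takes the three entries of the error row of Table X.

```python
import math


def correlation_from_sums(suu, suw, sww):
    """Correlation coefficient from sums of squares and products."""
    return suw / math.sqrt(suu * sww)


# Error row of Table X: (u^2), (uw), (w^2).
r_error = correlation_from_sums(suu=10111, suw=-170.1, sww=520.19)
print(f"r (error row) = {r_error:.4f}")
```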

Only the last row of Table X has been used in this calculation.
Ordinarily, therefore, these numbers would be calculated directly,
omitting the other rows of Table X. However, when interaction terms
supply the error row, it is simplest to calculate the whole table. Also, it
is occasionally of interest to calculate a correlation or regression
coefficient from rows other than the error row. In Table X, the times row yields
a correlation coefficient r = .8478 and in the drugs row, r = -.4737.
These correlations are based on too few degrees of freedom to warrant
much discussion, but their magnitudes show the importance of removing
these degrees of freedom before attempting to test for relationship
between u and w. The correlation calculated from the error row is a partial
correlation in which the effects of time and drugs have been eliminated.

Doubtless it is desirable that we remove also the effects of the
dependence of u and w on the variable x (initial total body weight). This
amounts to fitting regressions of u and w on x and correlating the residuals.
The computation may be regarded as an analysis of covariance according
to the pattern laid out in Tables XI and XII.


                              TABLE XI

                    SUMS OF SQUARES AND PRODUCTS

                  d.f.    (u^2)     (uw)     (w^2)     (ux)     (wx)     (x^2)

times               2     13269    801.8     67.28    -1736      ...       264
drugs               3      7931   -176.6     16.81     3451      ...      2969
times X drugs       6      1529    182.2    310.55     2083   [?]66.7     9053
error              36     10111   -170.1    520.19    14562    200.0     38244


                              TABLE XII

   SUMS OF SQUARES AND PRODUCTS FOR REGRESSIONS AND DEVIATIONS
                          FROM REGRESSIONS

                     regression                        deviations
           d.f.   (u^2)    (uw)   (w^2)      d.f.   (u^2)     (uw)    (w^2)

error        1     5545    76.2    1.05       35     4566   -246.3   519.14


The correlation coefficient calculated from the residuals is then

$$r = \frac{-246.3}{[(4566)(519.14)]^{1/2}} = -.1600$$

This number would be tested by entering the table with n = 34. It is,
of course, far from attaining significance and therefore we find no
convincing evidence of a connection between atrophy and fibrillation.

The only calculation in Table XII which has not been encountered
earlier is that which yields the sum of products of deviations about
regressions. This number is obtained by combining sums of squares and
products taken from the error row of Table XI:

-170.1 - (14562)(200.0)/(38244) = -170.1 - 76.2 = -246.3
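
The whole of Table XII, and the partial correlation derived from it, can be reproduced with the short sketch below. The error-row entries are those used in the text; the function name is hypothetical.

```python
import math


def deviations_about_regression(suu, suw, sww, sux, swx, sxx):
    """Sums of squares and products of u and w after regression on x."""
    dev_uu = suu - sux ** 2 / sxx
    dev_uw = suw - sux * swx / sxx
    dev_ww = sww - swx ** 2 / sxx
    return dev_uu, dev_uw, dev_ww


# Error row: (u^2), (uw), (w^2), (ux), (wx), (x^2).
dev_uu, dev_uw, dev_ww = deviations_about_regression(
    suu=10111, suw=-170.1, sww=520.19, sux=14562, swx=200.0, sxx=38244
)
r_partial = dev_uw / math.sqrt(dev_uu * dev_ww)

print(f"deviations: (u^2) = {dev_uu:.0f}, (uw) = {dev_uw:.1f}, (w^2) = {dev_ww:.2f}")
print(f"partial r = {r_partial:.4f}")
```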

These examples illustrate the common types of analysis in which
covariance methods are used. Occasionally more elaborate applications
are required. For example, it may happen that a linear regression is
incapable of expressing adequately the dependence of one variable on
another and therefore a curved regression line is required. Procedures
appropriate to these unusual applications may be derived without
difficulty from the theory of least squares, but no attempt will be made here
to discuss questions of this kind.

SECTION II 

The analysis is carried, in Section I, only to the point where standard 
covariance procedures are illustrated. The results obtained there
indicate that further examination of the data is called for. Table IX
suggests that not all the differences among drugs are accounted for by
variation in initial and final body weights. Furthermore, it may be that final
body weight is not the best variable to use in making allowance for 
general drug effects, since it is to be expected that final denervated 
muscle weight would be more highly correlated with final intact muscle 
weight than with final body weight. At any rate, it should be worth 
while to repeat the computations which produced Tables VIII and IX, 
using final intact muscle weight in place of final body weight. The 
results are shown in Tables XIII and XIV. 


                              TABLE XIII

          SUMS OF SQUARES OF DEVIATIONS FROM REGRESSION

                    d.f.      s.s.      m.s.

error                34       2483        73
times                 2      10132      5066
drugs                 3        480       160
times X drugs         6        939       156


                              TABLE XIV

 DENERVATED MUSCLE WEIGHTS, ADJUSTED FOR INITIAL BODY WEIGHT AND
                    FINAL INTACT MUSCLE WEIGHT

                 A       B       C       D

 4 days         430     441     440     444
 8 days         360     347     374     312
12 days         327     246     340     246


The error term in Table XIII confirms the guess that intact muscle
weight is more closely correlated with denervated muscle weight than is
final body weight, since the residual error mean square is reduced to 73,
in contrast with 119 of Table VIII. The “drugs” mean square is seen
to be not significant with an F value of 2.19. However, Table XIV
indicates that, when allowance is made for general drug effects, using
intact muscle weight, the two levels of atropine (A and C) have
practically the same effect, that the effect of quinidine (B) differs little from
that of saline (D) and that the effect of atropine is appreciably different
from those of quinidine and saline. This difference is so marked as to
suggest the possibility that one of the three “drugs” degrees of freedom
may be reflecting a real difference and ought to be examined by itself.
Numerical tests support the decisions reached by inspection of
Table XIV and indicate also that the average time curve for u, adjusted
for the values of x and v, is practically linear. It seems reasonable,
therefore, to direct attention to 3 of the 11 “treatment” degrees of
freedom: the atropine-saline contrast, symbolized by $A + C - 2D$, the
linear component of the time curve, $t_3 - t_1$, and their interaction,
$(A + C - 2D)(t_3 - t_1)$. The corresponding mean squares, adjusted by
covariance for variation in the values of x and v, are 183, 9863 and 508.
Comparison of these mean squares with the error mean square (73)
yields the F-values 2.51, 135.03, 6.95, each with 1 and 34 d.f. The
component $A + C - 2D$ does not attain significance, but its interaction
with linear time is well above the 5% point. The reason for this is clear.
Differences between responses to atropine and to saline are small initially
and increase with time to the point where they become fairly large by the
end of the twelfth day. These final large differences are masked
somewhat in the main effect, since they are considered only in conjunction
with the smaller differences of the earlier periods, but emerge sharply in
the interaction component where they are contrasted with the differences
at the 4 day period. It appears, therefore, that this interaction
component is the critical element in the analysis and we are led to the decision
that the weight of the denervated muscle decreases somewhat more slowly
when atropine is used than in the controls. At this point, we find
ourselves offering the same conclusions as were reached on the basis of
Table II and we may well be disturbed here, as we were before, at finding
that those muscle weights which diminished most turn out, after
adjustment, to show the smallest decrease. Certainly, when we give the
verdict “A and C are alike and B and D are alike, but A and C differ from
B and D”, we are making a statement which obviously is true for the
[...] since under treatments A and C, intact muscle weight
[...] a steady increase.
Could it be this fact alone which accounts for the differences among the
u'-values in Table XIV? If so, our method of adjustment must be at
fault. In any event, it becomes necessary to inquire into the
assumptions on which the covariance adjustments rest and to find out if they
are satisfied in this experiment. Section III presents an attempt to test
some of these assumptions and to search into some of the physiological
implications of the results of the experiment.

SECTION III 

Naturally, a covariance adjustment is likely to be misleading unless 
the form of the function used to depict the relationship of the dependent 
to the independent variables (in these examples a linear function) is such 
as to represent this relationship adequately. Usually, with experimental 
data, a linear function can be made to serve reasonably well, by taking 
care, in the planning of the experiment, to avoid too large a range in the 
independent variables. Tests of linearity may be made exactly as in 
ordinary regression analysis and will not be discussed here. 

Another difficulty, more troublesome to deal with, arises from the
fact that, even though the regressions are linear, the regression
coefficients may vary from one part of the experiment to another. For
example, the dependence of u on x and v may vary with time or from one drug
to another. Indeed, it seems not unlikely that variation of this kind
would be encountered in this experiment, but the covariance adjustments
which have been made are based on the assumption that a single set of
regression coefficients can be used throughout the experiment.

A test for homogeneity of regression coefficients can be made by
standard covariance methods [2]. These tests show that the regression
coefficients are substantially the same for all four drugs, but that they
tend to decrease in value with increasing time. It seems appropriate,
therefore, to use different regressions for the three times. When this is
done, the numbers in Table XIV are not altered appreciably, so that we
may accept the findings based on Tables XIII and XIV and attempt an
interpretation of them.

An inquiry into the physiology of this situation reveals that little
meaning can be attached to values of denervated muscle weight adjusted
for either final intact muscle weight or final body weight, because the
weight loss of the denervated muscle is composed of two portions, that
attributable to atrophy and that caused by the drug (or, negatively, by
growth). Weight loss due to atrophy is permanent, under the conditions
of this experiment, whereas weight changes brought about by the drugs
or by growth are recoverable. Furthermore, after a muscle fibre has
atrophied, its weight is almost entirely unaffected by the administration
of drugs or by growth of the animal. Thus, the denervated muscle
becomes less responsive to changes from these sources as atrophy
progresses. Doubtless this fact accounts for the contrast between drugs
A and C and drugs B and D in Table XIV. It appears, then, that for
an investigation of atrophy, we have been studying the wrong variable
(denervated muscle weight) and using the wrong statistical model (linear
regression with constant coefficients of denervated muscle weight on
initial body weight and final body weight or final intact muscle weight).
It is clear, too, that we cannot undertake an analysis which will bring
out the behaviour of atrophy until we have an appropriate model on
which to base the analysis.

Apparently not enough quantitative information is available to 
support any specific model. The following speculations are offered 
simply to bring out the nature of the problem. 

Let us suppose that a muscle fibre which atrophies decreases in
weight in the ratio of $\lambda : 1$ and that the ultimate weight of the fibre is
not affected by the drugs or by growth. Let $p_t$ represent the proportion
of the muscle fibres which have atrophied after a period of denervation
of duration t. Then this portion of the muscle weighs $\lambda u_0 p_t$, where $u_0$
is the initial weight of this muscle. The remaining portion of the muscle,
which has not atrophied, weighs $(1 - p_t) u_0 f(t)$, where the function $f(t)$
is introduced to take account of changes in weight due to drugs, growth
and possibly other effects also. (No doubt even the functional form of
$f(t)$ changes from one treatment to another.) Then the weight of the
denervated muscle at time t is given by

$$u_t = \lambda u_0 p_t + u_0 (1 - p_t) f(t).$$

Similarly, the weight of the intact muscle, at time t, is

$$v_t = v_0 f(t).$$

Now, if we equate $u_0$ and $v_0$ (which should introduce no serious error)
and eliminate $f(t)$ between the two equations, we obtain

$$u_t = \lambda u_0 p_t + (1 - p_t) v_t.$$

The initial weight of the denervated muscle, $u_0$, is not measurable and
its value varies from one animal to another, but it seems reasonable to
assume, as we did earlier, that initial denervated muscle weights vary
linearly with initial body weights, that is,

$$u_0 = a + bx.$$

Also, it seems worth while to try the assumption that $p_t$ increases
linearly with time, since the time range in this experiment is relatively
short. Thus, writing $p_t = \alpha t$ and substituting for $p_t$ and $u_0$, the
equation may be written

$$\frac{v_t - u_t}{t} = -\lambda a \alpha - \lambda b \alpha x + \alpha v_t.$$

Now the assertion that the various drugs produce different effects on
the progress of atrophy is equivalent to the statement that $p_t$ varies
from one drug to another. On the other hand, if $p_t$ is unaffected by the
drugs and if all the assumptions which have been made are valid, then
$\alpha$ is constant throughout the experiment and a single regression equation
of $(v_t - u_t)/t$ on x and $v_t$ should fit all of the observations reasonably
well. This may be tested by applying to $z = (v_t - u_t)/t$ the same
covariance analysis as was conducted on $u_t$ in Section II. The results of the
calculations are given in Tables XV and XVI. (The t values used in
computing the z's are 1, 2, 3.)
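
The unadjusted cell means of z can be approximated from the cell totals of Table III, as in the sketch below. It assumes four animals per cell (48 observations in 12 cells) and the coded times 1, 2, 3; the names are hypothetical. Since the decimals were dropped from the totals, small discrepancies from means computed on the original records are to be expected, notably at 8 days.

```python
# Cell totals from Table III: (denervated u, intact v) by drug and day.
TOTALS = {
    "A": {4: (421, 468), 8: (281, 354), 12: (166, 256)},
    "B": {4: (441, 500), 8: (375, 544), 12: (306, 552)},
    "C": {4: (412, 451), 8: (322, 425), 12: (290, 416)},
    "D": {4: (503, 580), 8: (424, 650), 12: (366, 669)},
}
ANIMALS_PER_CELL = 4           # assumption: 48 observations in 12 cells
TIME_CODE = {4: 1, 8: 2, 12: 3}  # coded t values stated in the text


def mean_z(u_total, v_total, t):
    """Cell mean of z = (v - u)/t computed from the cell totals."""
    return (v_total - u_total) / (ANIMALS_PER_CELL * t)


z_means = {
    drug: {day: mean_z(u, v, TIME_CODE[day]) for day, (u, v) in cells.items()}
    for drug, cells in TOTALS.items()
}

for day in (4, 8, 12):
    row = "  ".join(f"{drug}: {z_means[drug][day]:6.2f}" for drug in "ABCD")
    print(f"{day:>2} days  {row}")
```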


                              TABLE XV

          SUMS OF SQUARES OF DEVIATIONS FROM REGRESSION

                    d.f.      s.s.      m.s.

error                34        933        27
times                 2        168        84
drugs                 3        ...        38
times X drugs         6         69        11


                              TABLE XVI

 VALUES OF z, UNADJUSTED AND ADJUSTED FOR INITIAL BODY WEIGHT
                 AND FINAL INTACT MUSCLE WEIGHT

                  A                B                C                D
               z      z'        z      z'        z      z'        z      z'

 4 days     11.75  13.38     14.75  13.13      9.75  11.00     19.25  15.05
 8 days      9.00  16.41     21.00  17.27     12.75  14.01     28.00  21.84
12 days      7.50  16.59     20.50  20.51     10.50  13.37     25.25  17.45


The “drugs” mean square is far from attaining significance; the
“times” mean square is just below the 5 per cent point, which, even
though not significant, may indicate that the assumption that $p_t = \alpha t$
is not too well satisfied. While this test cannot be considered a critical
test of the effects of the drugs, at least we find nothing to support the
view that the drugs have had a specific effect on the rate of atrophy. It
seems likely that a much more elaborate and carefully designed
experiment is needed to produce satisfactory evidence on this question.

Probably some of the discussion which led to this statistical model
runs counter to facts which are known to physiologists, but this does not
necessarily imply that the model is wholly incorrect. Alternative
statements, more in accord with known facts but of a more complicated
nature, can lead to the same model. A few specific points, however,
deserve some comment.

The statement that, when a muscle fibre atrophies, its weight
decreases in the constant ratio $\lambda : 1$ is an unsupported assumption, but at
any rate, it seems less serious than the same assumption, applied to the
whole muscle (which was rejected early in this discussion), in view of the
fact that the whole muscle contains, in addition to muscle fibres, material
which does not lose weight as a result of denervation. This same point
comes in again later on, when the weights of the muscles are treated as
if they are made up entirely of the weights of muscle fibres. The
inaccuracy introduced into the model by neglecting non-muscular elements
may not appear in the final computations. At least, it is unlikely to
affect the difference $v_t - u_t$ and its effects elsewhere may be taken up
by the regression on initial weights. At any rate, this experiment
provides no information with which this question can be investigated.
INTRODUCTION

Two experiments were conducted in 1946-1947 by the Agricultural
Research Station of the Empire Cotton Growing Corporation
in the Uganda Protectorate of British East Africa to determine the
optimum planting date for cotton. One experiment was laid out at the 
Kawanda Station, where the main rains occur during the first half of the 
year; the other experiment was laid out at Kabula, where the main rains 
occur during the second half of the year. Since it was suspected that 
interference from the insect, Lygus simonyi, would be a potent factor, 
it was not considered advisable to plant a long series of sowing dates in 
one locality. Lygus is a small capsid bug, which breeds upon grain 
crops planted with the first rains in a given locality; when these early 
crops are harvested, the bug migrates on to newly planted cotton, if such 
is available. There has also been some scanty evidence that where 
young cotton is planted in contiguous plots, the Lygus bugs would tend 
to migrate to the younger cotton. 

In these experiments, it was decided to plant at three successive 
dates, separated by intervals of two weeks (one fortnight), at each of 
eight localities, with a two-thirds overlap between successive localities. 
That is, dates 1, 2 and 3 were to be used at the first locality; dates 2, 3 
and 4 at the second locality; dates 3, 4 and 5 at the third locality; etc. 
This experimental design allowed us to use ten planting dates, spread 
over an 18 week period. Four replications were used at each locality, 
with the three dates at each locality being used in each replicate. Hence 
the yields at each locality could be analyzed on the basis of a 4 X 3 
randomized blocks design. The design and actual plot yields (kilograms 
of seed cotton per plot) of the experiment at the Kawanda Station are 
presented in Table 1. The same design was used at the Kabula Station, 
except that the planting dates ranged from September 13 through 
January 17.

TABLE 1

DESIGN AND ACTUAL YIELDS (KGMS/PLOT) OF SEED COTTON FOR A PLANTING
DATE TRIAL AT THE KAWANDA STATION

The problem is to determine the best planting date, and to set up
some kind of confidence limits on this date. One also might want to 
consider the efficiency of this design relative to other designs, 
assuming that the insect interference were actually negligible. For 
example, what could one gain by planting at four successive dates at 
each locality, giving a three-fourths overlap? Also, what is the loss in 
efficiency as compared to a completely balanced design, whereby all 
dates are used at each locality, assuming no insect interference? 

ANALYSIS OF THE KAWANDA DATA 

(a) Analysis of Variance. In order to evaluate the true differences 
between the yields at successive planting dates, it is necessary to adjust 
the average yields for the differences among the localities. This 
necessitates a rather complicated least squares solution of the date and 
locality effects, which we could avoid if there were good evidence of the 
non-existence of real locality differences. As a preliminary step, we shall 
consider the analysis of variance for the Kawanda data. It will be granted 
that we have not demonstrated the non-existence of locality effects even 
though the analysis of variance shows no significant differences; 
however, the experimentalist probably would be willing to neglect small 
locality effects in order to forego the necessity of carrying out the least 
squares solution. 

From Table 1 we note that there are 23 degrees of freedom for the 
locality, date and residual (locality-date interaction) constants. Since 
there are 7 degrees of freedom for localities and 9 for dates, there must 
be 7 remaining for the locality-date interaction. The sum of squares 
for these 7 degrees of freedom can be determined from the sum of 7 
independent squares, each representing one degree of freedom. These 
7 independent squares are formed by squaring various linear 
combinations of the 24 locality-date totals. Let $y_{tj}$ represent the total 
yield of the 4 plots planted at the $t$-th date and the $j$-th locality, where 
$t = 1, 2, \ldots, 10$ and $j = 1, 2, \ldots, 8$. A linear combination $C$ can be 
represented as 

(1) $C = \sum_{t,j} a_{tj}\, y_{tj}$. 

Then $C^2 / (4 \sum a_{tj}^2)$ is a quantity which can be used to represent the 
sum of squares with one degree of freedom. In forming the 7 independent 
linear combinations, 7 different sets of $a_{tj}$ are needed. In order that 
any two forms $C$ and $C'$ be independent, 

$\sum_{t,j} a_{tj}\, a'_{tj} = 0$. 
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TABLE 2 

THE COEFFICIENTS TO FORM A SET OF 7 INDEPENDENT LINEAR COMBINATIONS 
FOR THE LOCALITY-DATE INTERACTIONS 

                   Order of                   Linear Combinations* 
Locality   Date    Planting    Yield      1     2     3     4     5     6      7 

    1        1        1         9.72      .     .     .     .     .     .      . 
    1        2        2        10.70     +1     .     .     .    +1     .     +1 
    1        3        3         8.69     -1     .     .     .    -1     .     -1 
    2        2        1        28.67     -1     .     .     .    -1     .     -1 
    2        3        2        27.67     +1     .     .     .    -3     .     -3 
    2        4        3        26.51      .     .     .     .    +4     .     +4 
    3        3        1        23.11      .     .     .     .    +4     .     +4 
    3        4        2        22.84      .    +1     .     .    -3     .    +11 
    3        5        3        17.66      .    -1     .     .    -1     .    -15 
    4        4        1        10.43      .    -1     .     .    -1     .    -15 
    4        5        2         8.58      .    +1     .     .    +1     .    -41 
    4        6        3         8.06      .     .     .     .     .     .    +56 
    5        5        1        17.68      .     .     .     .     .     .    +56 
    5        6        2        18.27      .     .    +1     .     .    +1    -41 
    5        7        3        14.60      .     .    -1     .     .    -1    -15 
    6        6        1        17.47      .     .    -1     .     .    -1    -15 
    6        7        2        15.94      .     .    +1     .     .    -3    +11 
    6        8        3        12.59      .     .     .     .     .    +4     +4 
    7        7        1        24.39      .     .     .     .     .    +4     +4 
    7        8        2        20.86      .     .     .    +1     .    -3     -3 
    7        9        3        17.40      .     .     .    -1     .    -1     -1 
    8        8        1        18.55      .     .     .    -1     .    -1     -1 
    8        9        2        15.02      .     .     .    +1     .    +1     +1 
    8       10        3        12.19      .     .     .     .     .     .      . 

Total (C)                               1.01  3.33  2.14 -0.07  0.78  2.79  17.99 
Divisor (D)                               16    16    16    16   224   224 43,456 
C²/D (combinations 1 to 4)    1.0433 
C²/D (combinations 5 and 6)   0.0375 
C²/D (combination 7)          0.0074 

*Dots represent 0 coefficients. Each yield is the total yield of 4 plots 
at the Kawanda Station. 


In addition, these combinations must be independent of date and 
locality effects. In order to fulfill this condition, the $a_{tj}$ of a given $C$ 
for any date or locality must also sum to 0. 

One set of 7 independent linear combinations for the residual or 
interaction sum of squares is given in Table 2. The first four 
combinations represent first-order interaction effects. For example, consider 

(2) $C_1 = (y_{21} - y_{31}) - (y_{22} - y_{32}) = y_{21} - y_{31} - y_{22} + y_{32}$. 

Each part of the right-hand side measures the difference between the 
yields at dates 2 and 3 but at two different locations, 1 and 2. Hence 
the difference between these two parts measures the change in the date 
effect from one location to another, which we designate as the 
date-location interaction. We note that $C_1$ is independent of date and 
locality effects, since the sum of the coefficients for each date and 
locality is zero. Also the first four combinations are obviously 
independent, since they have no plots in common. 


TABLE 3 

ANALYSIS OF VARIANCE FOR THE KAWANDA DATA 

Source               Degrees of Freedom    Sum of Squares    Mean Square 

Interaction                   7                 1.0882          0.1555 
Localities                    7               201.1345 
Dates (adj.)                  9                21.7354          2.4150** 
Dates                         9                45.2186 
Localities (adj.)             7               177.6513         25.3787** 
Replications                 24                11.6219          0.4842 
Error                        48                17.3125          0.3606 

**Significant at the 1% probability level. 


In this experiment these first four combinations also represent some 
effect of the order of planting on yield (and consequently of Lygus 
migration from the early to the late plantings). The effect of the Lygus 
migration from the early to the late plantings is a quadratic effect, 
since the sums ($C$) are sums of single yields at the first and third 
plantings subtracted from two yields at the second planting. The sum of the 
first four combinations could be used to measure this quadratic effect 
of the order of planting with one degree of freedom, and the remaining 
three degrees of freedom would then represent the interaction effect, 
adjusted for order of planting. The sum of squares for the order of 
planting would be $(6.41)^2/[4(16)] = 0.6420$, leaving 0.4013 for the three 
interaction degrees of freedom. None of these effects is significantly 
different from zero, using the error variance given in Table 3. Since 
there is obviously no date-location interaction, we can use each of 
the first four $C$'s as a measure of the effect of order of planting. The 
standard error of each of these $C$'s is 2.40, indicating that none of them 
is significantly different from zero. It is possible to compute a separate 
error term for each $C$ with 6 degrees of freedom, but this does not alter 
any of the above conclusions. 
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Combinations 5 and 6 are each based on a comparison of 4 dates and 
4 localities. For example, 

(3) $C_5 = y_{21} - y_{22} - y_{31} - 3y_{32} + 4y_{33} + 4y_{42} - 3y_{43} - y_{44} - y_{53} + y_{54}$. 

Again the sum of the coefficients for any date or locality is 0. In addition 
the sum of the products of these coefficients with those for any of the 
first 4 combinations is 0. Combination 7 is an over-all comparison, which 
was constructed so as to be independent of all the other combinations 
as well as dates and localities. 

The divisors are the denominators, $4\sum a_{tj}^2$, mentioned above. For 
the first 4 combinations the divisors are 4(1 + 1 + 1 + 1) = 16. For 
the next two, the divisors are 2[4(1 + 1 + 1 + 9 + 16)] = 224. And 
finally for the last comparison, the divisor is 4(10864) = 43456. The 
combination totals and contributions to the residual sum of squares are 
given at the bottom of Table 2, and SSR = 1.0882. 
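
The construction of the seven combinations can be checked numerically. The sketch below is only an illustration in Python, not part of the original computations: it rebuilds the coefficient sets described above, confirms that they are mutually orthogonal and sum to zero within every date and locality, and recomputes the residual sum of squares. The 24 cell totals are the Kawanda yields of Table 2 (the total for date 3 at locality 1, 8.69, is inferred here from the locality total 29.11), and the helper names `first_order`, `four_locality`, `total`, `divisor` and `dot` are hypothetical.

```python
# Totals of 4 plots at Kawanda, keyed by (date t, locality j).
y = {
    (1, 1): 9.72, (2, 1): 10.70, (3, 1): 8.69,
    (2, 2): 28.67, (3, 2): 27.67, (4, 2): 26.51,
    (3, 3): 23.11, (4, 3): 22.84, (5, 3): 17.66,
    (4, 4): 10.43, (5, 4): 8.58, (6, 4): 8.06,
    (5, 5): 17.68, (6, 5): 18.27, (7, 5): 14.60,
    (6, 6): 17.47, (7, 6): 15.94, (8, 6): 12.59,
    (7, 7): 24.39, (8, 7): 20.86, (9, 7): 17.40,
    (8, 8): 18.55, (9, 8): 15.02, (10, 8): 12.19,
}

def first_order(j):
    """Combinations 1-4: dates j+1, j+2 compared across localities j, j+1."""
    return {(j + 1, j): 1, (j + 2, j): -1, (j + 1, j + 1): -1, (j + 2, j + 1): 1}

def four_locality(j):
    """Combinations 5-6: comparison spanning localities j .. j+3."""
    return {
        (j + 1, j): 1, (j + 2, j): -1,
        (j + 1, j + 1): -1, (j + 2, j + 1): -3, (j + 3, j + 1): 4,
        (j + 2, j + 2): 4, (j + 3, j + 2): -3, (j + 4, j + 2): -1,
        (j + 3, j + 3): -1, (j + 4, j + 3): 1,
    }

# Combination 7: the over-all comparison.
overall = {
    (2, 1): 1, (3, 1): -1,
    (2, 2): -1, (3, 2): -3, (4, 2): 4,
    (3, 3): 4, (4, 3): 11, (5, 3): -15,
    (4, 4): -15, (5, 4): -41, (6, 4): 56,
    (5, 5): 56, (6, 5): -41, (7, 5): -15,
    (6, 6): -15, (7, 6): 11, (8, 6): 4,
    (7, 7): 4, (8, 7): -3, (9, 7): -1,
    (8, 8): -1, (9, 8): 1,
}

combos = [first_order(j) for j in (1, 3, 5, 7)]
combos += [four_locality(1), four_locality(5), overall]

def total(c):
    """C = sum of a_tj * y_tj."""
    return sum(a * y[cell] for cell, a in c.items())

def divisor(c):
    """D = 4 * sum of a_tj squared (each total is based on 4 plots)."""
    return 4 * sum(a * a for a in c.values())

def dot(c1, c2):
    """Sum of products of coefficients; zero means the two forms are independent."""
    return sum(a * c2.get(cell, 0) for cell, a in c1.items())

totals = [total(c) for c in combos]
divisors = [divisor(c) for c in combos]
ssr = sum(t * t / d for t, d in zip(totals, divisors))

print([round(t, 2) for t in totals])
print(divisors)
print(round(ssr, 4))
```

Storing each combination as a sparse dictionary keeps the blank (zero) cells out of the arithmetic, which mirrors the layout of Table 2.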

Next we compute the total sum of squares for the 23 degrees of 
freedom (SST), the sum of squares for the unadjusted date effects 
(SSD), and the sum of squares for the unadjusted locality effects (SSL). 
These are computed as follows: 

Correction for mean = $C = (407.60)^2/96 = 1730.6017$ 

(4) $SST = [(9.72)^2 + (10.70)^2 + \cdots + (12.19)^2]/4 - C = 223.9581$ 

    $SSL = [(29.11)^2 + (82.85)^2 + \cdots + (45.76)^2]/12 - C = 201.1345$ 

    $SSD = \frac{(9.72)^2}{4} + \frac{(39.37)^2}{8} + \frac{(59.47)^2}{12} + \cdots + \frac{(12.19)^2}{4} - C = 45.2186$ 

In order to make exact tests of significance, we require the sum of 
squares for localities adjusted for dates [SSL(adj.)] and the sum of 
squares for dates adjusted for localities [SSD(adj.)]. A simple identity 
can be used to evaluate these adjusted sums of squares: 

(5) SSL + SSD(adj.) = SSD + SSL(adj.) = SST - SSR = 222.8699. 

Hence 

(6) SSD(adj.) = 222.8699 - 201.1345 = 21.7354 
    SSL(adj.) = 222.8699 - 45.2186 = 177.6513. 
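
The arithmetic of equations (4) to (6) can be retraced in a few lines. The following Python sketch is an illustration only; it assumes the 24 locality-date totals of Table 2 (with 8.69 for date 3 at locality 1 inferred from the locality total) and takes SSR = 1.0882 as given in the text. The variable names are arbitrary.

```python
# Totals of 4 plots at Kawanda, one row per locality j = 1..8;
# locality j carries planting dates j, j+1 and j+2.
totals = [
    [9.72, 10.70, 8.69],
    [28.67, 27.67, 26.51],
    [23.11, 22.84, 17.66],
    [10.43, 8.58, 8.06],
    [17.68, 18.27, 14.60],
    [17.47, 15.94, 12.59],
    [24.39, 20.86, 17.40],
    [18.55, 15.02, 12.19],
]
SSR = 1.0882      # interaction sum of squares, taken from the text
N_PLOTS = 96      # 8 localities x 3 dates x 4 replicates

grand = sum(sum(row) for row in totals)
correction = grand ** 2 / N_PLOTS

# Total sum of squares among the 24 cell totals (each a sum of 4 plots).
sst = sum(v * v for row in totals for v in row) / 4 - correction

# Unadjusted locality sum of squares (12 plots per locality).
ssl = sum(sum(row) ** 2 for row in totals) / 12 - correction

# Unadjusted date sum of squares (4, 8 or 12 plots per date).
by_date = {}
for j, row in enumerate(totals, start=1):
    for offset, v in enumerate(row):
        by_date.setdefault(j + offset, []).append(v)
ssd = sum(sum(v) ** 2 / (4 * len(v)) for v in by_date.values()) - correction

# Identity (5): SSL + SSD(adj.) = SSD + SSL(adj.) = SST - SSR.
ssd_adj = sst - SSR - ssl
ssl_adj = sst - SSR - ssd

for name, value in [("C", correction), ("SST", sst), ("SSL", ssl),
                    ("SSD", ssd), ("SSD(adj.)", ssd_adj), ("SSL(adj.)", ssl_adj)]:
    print(f"{name:10s} {value:10.4f}")
```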

Finally the error sum of squares (SSE) is given by the replication by 
date sum of squares summed over the 8 locations, with 3 × 2 × 8 = 48 
degrees of freedom. Because of the large locality differences, it might 
be suspected that the error variance would not be the same for all 
locations. Bartlett's $\chi^2$ test for the homogeneity of variance was applied, 
giving $\chi^2 = 7.364$ with 7 degrees of freedom. The probability of 
obtaining this or a larger value of $\chi^2$, assuming equal error variances at all 
localities, is about .40, indicating that it is quite reasonable to assume 
equal error variances at all localities and to pool the individual estimates 
as mentioned above. It might be argued that we should also pool the 
location × date interaction with the error variance, giving a total of 55 
error degrees of freedom. We have not done this in the analysis which 
follows, because the additional degrees of freedom were not thought 
necessary in this case. 

The complete analysis of variance is given in Table 3. 

This analysis indicates that there are highly significant differences 
among both the dates and localities. Since there are real differences 
between the mean yields for different localities, the mean yields at each 
planting date should be adjusted for locality effects. Two types of 
analysis are suggested at this point. Either we determine the adjusted 
average yield for each planting date, or we fit a regression curve of some 
kind to the adjusted average yields. 

(b) Average Yields for Each Planting Date, Adjusted for Locality Effects. 

We can represent the total yield $y_{tj}$ of the 4 plots at a given date, $t$, 
and locality, $j$, as follows: 

(7) $y_{tj} = 4(m + d_t + l_j) + r_{tj}$, 

where $m$ is the general mean, $d_t$ is the effect of the $t$-th planting date 
($t = 1, 2, \ldots, 10$), $l_j$ is the effect of the $j$-th locality ($j = 1, 2, \ldots, 8$), 
and $r_{tj}$ is the residual after accounting for the effects of the constants 
$m$, $d_t$ and $l_j$. It should be noted that we have multiplied the right-hand 
side by 4 in order to put the results on a per-plot basis. 

The constants will be estimated by the method of least squares. 
First we set up the error equation 

(8) $SSR = \sum_{t,j} [y_{tj} - 4(m + d_t + l_j)]^2 / 4$, 

where SSR indicates the interaction or residual sum of squares, the sum 
of the 24 squared residuals, $r_{tj}^2/4$. The constants are estimated by 
minimizing SSR with respect to each of the constants. For $d_1$, we set 

(9) $\partial SSR / \partial d_1 = 0$. 

This gives 

(10) $\sum_j [y_{1j} - 4(m + d_1 + l_j)] = 0$, 

where the summation is made over only the one locality (4 plots) having 
the first planting date. Equation (10) simplifies to 

(11) $\sum_j y_{1j} = D_1 = 4d_1 + 4l_1 + 4m$. 

Similar equations can be obtained for each $d_t$. 

Similarly if we minimize SSR with respect to $l_1$, we have 

(12) $\sum_t y_{t1} = L_1 = 4(d_1 + d_2 + d_3) + 12l_1 + 12m$. 

The equation for $m$ is 

(13) $\sum_{t,j} y_{tj} = G = 4d_1 + 8d_2 + 12(d_3 + \cdots + d_8) + 8d_9 + 4d_{10} + 12(l_1 + \cdots + l_8) + 96m$. 

The coefficients of all 19 least squares equations are presented in 
Table 4. The reduced equations, after eliminating the locality constants, 
are given at the bottom of Table 4. In this table the yield totals, 
adjusted for localities, are indicated by $A_t$. The $G$ equation indicates 
that if $m$ is to be the general mean, we should assume that $\sum_j l_j = 0$ and 
that 

(14) $[d_1 + d_{10} + 2(d_2 + d_9) + 3(d_3 + \cdots + d_8)] = 0$. 

In order to obtain standard errors to test for the difference between 
the adjusted average yields at any two dates, it is necessary to invert 
the coefficient matrix for the adjusted date effects. The inverted matrix 
is given in Table 5. The elements of this matrix 
($M_{tj}$, where $t$ stands for the row and 
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TABLE 4 

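COEFFICIENTS OF LEAST SQUARES EQUATIONS FOR ESTIMATION OF PLANTING 
DATE MEANS 

                                   Independent Variables 
                         Dates (d_t)                    Localities (l_j) 
Plot Totals*      1  2  3  4  5  6  7  8  9 10     1  2  3  4  5  6  7  8     m 

D1  :   9.72      4  0  0  0  0  0  0  0  0  0     4  0  0  0  0  0  0  0     4 
D2  :  39.37      0  8  0  0  0  0  0  0  0  0     4  4  0  0  0  0  0  0     8 
D3  :  59.47      0  0 12  0  0  0  0  0  0  0     4  4  4  0  0  0  0  0    12 
D4  :  59.78      0  0  0 12  0  0  0  0  0  0     0  4  4  4  0  0  0  0    12 
D5  :  43.92      0  0  0  0 12  0  0  0  0  0     0  0  4  4  4  0  0  0    12 
D6  :  43.80      0  0  0  0  0 12  0  0  0  0     0  0  0  4  4  4  0  0    12 
D7  :  54.93      0  0  0  0  0  0 12  0  0  0     0  0  0  0  4  4  4  0    12 
D8  :  52.00      0  0  0  0  0  0  0 12  0  0     0  0  0  0  0  4  4  4    12 
D9  :  32.42      0  0  0  0  0  0  0  0  8  0     0  0  0  0  0  0  4  4     8 
D10 :  12.19      0  0  0  0  0  0  0  0  0  4     0  0  0  0  0  0  0  4     4 

L1  :  29.11      4  4  4  0  0  0  0  0  0  0    12  0  0  0  0  0  0  0    12 
L2  :  82.85      0  4  4  4  0  0  0  0  0  0     0 12  0  0  0  0  0  0    12 
L3  :  63.61      0  0  4  4  4  0  0  0  0  0     0  0 12  0  0  0  0  0    12 
L4  :  27.07      0  0  0  4  4  4  0  0  0  0     0  0  0 12  0  0  0  0    12 
L5  :  50.55      0  0  0  0  4  4  4  0  0  0     0  0  0  0 12  0  0  0    12 
L6  :  46.00      0  0  0  0  0  4  4  4  0  0     0  0  0  0  0 12  0  0    12 
L7  :  62.65      0  0  0  0  0  0  4  4  4  0     0  0  0  0  0  0 12  0    12 
L8  :  45.76      0  0  0  0  0  0  0  4  4  4     0  0  0  0  0  0  0 12    12 

G   : 407.60      4  8 12 12 12 12 12 12  8  4    12 12 12 12 12 12 12 12    96 


EQUATIONS FOR DATE EFFECTS ADJUSTED FOR LOCALITIES 

                                            Coefficients of Adjusted Date Effects (d_t) 
Adjusted Totals*                             1    2    3    4    5    6    7    8    9   10 

A1  = 3D1  - L1           =   0.05           8   -4   -4    0    0    0    0    0    0    0 
A2  = 3D2  - L1 - L2      =   6.15          -4   16   -8   -4    0    0    0    0    0    0 
A3  = 3D3  - L1 - L2 - L3 =   2.84          -4   -8   24   -8   -4    0    0    0    0    0 
A4  = 3D4  - L2 - L3 - L4 =   5.81           0   -4   -8   24   -8   -4    0    0    0    0 
A5  = 3D5  - L3 - L4 - L5 =  -9.47           0    0   -4   -8   24   -8   -4    0    0    0 
A6  = 3D6  - L4 - L5 - L6 =   7.78           0    0    0   -4   -8   24   -8   -4    0    0 
A7  = 3D7  - L5 - L6 - L7 =   5.59           0    0    0    0   -4   -8   24   -8   -4    0 
A8  = 3D8  - L6 - L7 - L8 =   1.59           0    0    0    0    0   -4   -8   24   -8   -4 
A9  = 3D9  - L7 - L8      = -11.15           0    0    0    0    0    0   -4   -8   16   -4 
A10 = 3D10 - L8           =  -9.19           0    0    0    0    0    0    0   -4   -4    8 

*Plot Totals at Kawanda. 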

$j$ the column number) can be used to determine both the values of the 
adjusted date effects and their variances. Before inverting the 10-row 
coefficient matrix at the bottom of Table 4, it is noted that the 
coefficients always sum to 0 for a given row or column. Such a matrix is 
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TABLE 5 

INVERSE ($M_{tj}$) MATRIX FOR ADJUSTED DATE EFFECTS* 

  t \ j       1        2        3        4        5        6        7        8        9      10 

   1       .48884 
   2       .37500   .40550 
   3       .35268   .34450   .36086 
   4       .30580   .30799   .30362   .31892 
   5       .26562   .26504   .26622    ...     .27744 
   6       .22322   .22337    ...     .22414    ...     .23503 
   7       .18304   .18299   .18308   .18276   .18396   .17949   .19615 
   8       .13616   .13617   .13615   .13621   .13601   .13675   .13398   .14434 
   9       .11384   .11383   .11385   .11379   .11399   .11325   .11602   .10566    ... 
  10          0        0        0        0        0        0        0        0        0       0 

A_j**        0.05     6.15     2.84     5.81    -9.47     7.78     5.59     1.59   -11.15   -9.19 
d'_t        4.300    4.478    4.110    3.856    2.981    3.022    2.387    1.598    0.699       0 
(d_t + m)   5.691    5.869    5.501    5.247    4.372    4.413    3.778    2.989    2.090   1.391 
s²(d'_t)   0.5288   0.4387   0.3904   0.3450   0.3001   0.2543   0.2122   0.1561   0.1740       0 
s(d'_t)    0.7272   0.6623   0.6248   0.5874   0.5478   0.5043   0.4607   0.3951   0.4171       0 

*Since the matrix is symmetrical, only the lower left section is reproduced here. 
**Data from the Kawanda experiment. 


called a singular matrix and cannot be inverted. This difficulty could 
have been taken care of by dropping the last equation and eliminating 
$d_{10}$ by use of equation (14). However, we decided on a simpler 
procedure, namely to assume $d_{10} = 0$, and hence drop the 10-th row and 
column from the matrix. Then we merely inserted a column and a row of 
zeros in the inverted matrix. We have indicated the date effects under 
this assumption as $d'_t$. 

For those not accustomed to the matrix notation, we might add that 
a matrix is simply a two-way array of figures such as the coefficients in 
Table 4. We are here dealing with symmetrical matrices. An inverted 
symmetrical matrix consists of another two-way array which has the 
following property: If you multiply the elements of any row (or column) 
of the original symmetrical matrix by the corresponding elements of the 
same row (or column) of the inverted matrix, the sum of the products 
is unity; if you multiply the corresponding elements of different rows or 
columns, the sum of the products is zero. 

The values of the adjusted date effects are then computed from the 
$M$ and $A$ values as follows: 

(15) $d'_t = \sum_{j=1}^{10} M_{tj} A_j$. 

For example, $d'_1 = [(0.48884)(0.05) + \cdots + (0.11384)(-11.15)] = 
4.3005$. Note that all $M_{t,10}$ values are 0. The complete set of $d'_t$ values 
is presented in Table 5. 

In order to obtain the original $d_t$ values, we subtract a constant $k$ 
from each $d'_t$ so that the new values $d'_t - k = d_t$ fulfill equation (14). 
We see that 

(16) $k = [d'_1 + 2d'_2 + 3(d'_3 + \cdots + d'_8) + 2d'_9 + d'_{10}]/24 = 2.855$. 

In most cases, it is desired to replace the date effects by adjusted yields 
at each date by adding $m$ (= 4.246) to each of the $d_t$. Hence we can 
combine both steps by adding 

(17) $m - k = 4.246 - 2.855 = 1.391$ 

to each of the $d'_t$. These adjusted yields for each date ($d_t + m$) are also 
given in Table 5. 
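
The whole chain from the reduced equations to the adjusted yields can be retraced mechanically. The Python sketch below is an illustration under stated assumptions, not the computing procedure used for Table 5: it builds the reduced coefficient matrix from the design (date t occurs at localities t-2, t-1 and t where these exist), sets $d'_{10} = 0$, solves the remaining nine equations directly by elimination instead of forming the inverse, and then applies equations (16) and (17). The date and locality totals are those of Table 4; the function name `solve` is hypothetical.

```python
# Date totals D_t (t = 1..10) and locality totals L_j (j = 1..8) at Kawanda.
D = [9.72, 39.37, 59.47, 59.78, 43.92, 43.80, 54.93, 52.00, 32.42, 12.19]
L = [29.11, 82.85, 63.61, 27.07, 50.55, 46.00, 62.65, 45.76]

# Locality j (1-based) carries dates j, j+1, j+2.
dates_at = [(j, j + 1, j + 2) for j in range(1, 9)]

# Adjusted totals A_t = 3 D_t - (sum of L_j over localities containing date t).
A = [3 * D[t - 1] - sum(L[j] for j, ds in enumerate(dates_at) if t in ds)
     for t in range(1, 11)]

# Reduced coefficients: +12 on the diagonal per occurrence of a date,
# -4 for every pair of dates (including a date with itself) sharing a locality.
K = [[0.0] * 10 for _ in range(10)]
for ds in dates_at:
    for t in ds:
        K[t - 1][t - 1] += 12
        for u in ds:
            K[t - 1][u - 1] -= 4

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(n):
            if r != i:
                f = m[r][i] / m[i][i]
                m[r] = [x - f * z for x, z in zip(m[r], m[i])]
    return [m[i][n] / m[i][i] for i in range(n)]

# The 10 x 10 matrix is singular, so fix d'_10 = 0 and drop its row and column.
d_prime = solve([row[:9] for row in K[:9]], A[:9]) + [0.0]

reps = [1, 2, 3, 3, 3, 3, 3, 3, 2, 1]            # localities per date
k = sum(r * d for r, d in zip(reps, d_prime)) / 24   # equation (16)
m_hat = sum(D) / 96                                   # general mean
adjusted = [d - k + m_hat for d in d_prime]           # d_t + m

print([round(a, 2) for a in A])
print([round(v, 3) for v in adjusted])
```

Solving the nine equations directly gives the same $d'_t$ as multiplying by the inverse; the inverse itself is needed only for the variances discussed next.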

The variance of any adjusted date effect ($d'_t$) is simply $3M_{tt}\sigma^2$, 
where $\sigma^2$ is estimated by the error variance, 0.3606. The factor, 3, is 
required because the $D_t$ were multiplied by 3 in determining the $A_t$. 
The variance of $d'_1$, for example, is 

3(0.48884)(0.3606) = 0.5288. 

The variances and standard errors of $d'_t$ are also given in Table 5. 

However, we are generally more interested in the variances of the 
differences between the adjusted mean yields for successive dates. For 
example, the difference between the average yields for the first two dates 
is 0.18. The variance of the difference between the adjusted mean yields 
at two dates, $i$ and $k$, is given by the formula 

(18) $3[M_{ii} - 2M_{ik} + M_{kk}]\sigma^2$, 

where $\sigma^2$ is the error variance, estimated to be 0.3606. Hence the 
variance of the difference between the average yields for the first two dates is 

3(0.48884 - 0.75000 + 0.40550)(0.3606) = (0.43301)(0.3606) = 0.1561. 

The standard error of the difference is therefore about 0.40, more than 
twice the difference itself. Hence we conclude that there was no 
significant difference between the two adjusted mean yields. The 
coefficients of the error variance (for example, 0.43301) are given in Table 6. 
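
Formula (18) is easily mechanized. The short Python sketch below is illustrative only; it uses the three inverse elements quoted in the text for dates 1 and 2 ($M_{11} = 0.48884$, $M_{12} = 0.37500$, $M_{22} = 0.40550$) and the error variance 0.3606, and the function name `variance_of_difference` is hypothetical.

```python
import math

ERROR_VARIANCE = 0.3606   # error mean square from Table 3 (48 d.f.)

def variance_of_difference(m_ii, m_ik, m_kk, s2=ERROR_VARIANCE):
    """Equation (18): 3 (M_ii - 2 M_ik + M_kk) s^2."""
    return 3 * (m_ii - 2 * m_ik + m_kk) * s2

# Dates 1 and 2 at Kawanda.
coefficient = 3 * (0.48884 - 2 * 0.37500 + 0.40550)
variance = variance_of_difference(0.48884, 0.37500, 0.40550)
std_error = math.sqrt(variance)

difference = 5.869 - 5.691          # adjusted mean yields, dates 2 and 1
ratio = difference / std_error      # compare with a t value on 48 d.f.

print(round(coefficient, 5), round(variance, 4), round(std_error, 3), round(ratio, 2))
```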

Some of the differences between adjusted mean yields for successive 
dates and their standard errors are given in the second half of Table 6. 
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TABLE 6 

COEFFICIENTS OF VARIANCES OF DIFFERENCES BETWEEN ADJUSTED DATE MEANS 

Date 




From these averages, we conclude that: (i) The optimum planting 
date is somewhere near the first planting date (May 1). This optimum 
is poorly determined, as shown by the fact that there is no significant 
decrease in the adjusted yield until the fifth planting date (June 26). 
(ii) There is an unaccountable plateau at the fifth and sixth planting 
dates. Except for this, after the fourth date there appears to be a 
general significant decrease in yield between successive planting dates until 
we reach the last date (September 4). There is a tendency to flatten out 
at this last date, as expected, because the yields cannot fall below zero. 

(c) Determination of the regression of cotton yields (adjusted for locality 
differences) on planting date. An examination of the adjusted mean 
yields ($d_t + m$) in Table 5 suggests that a regression equation of the 
following type should be used: 

(20) $y_{tj} = 4[a + b(t - \bar{t}) + c(t - \bar{t})^2 + l_j] + \rho_{tj}$, 

where $\bar{t}$ is the mean time period (= 5.5) and $\rho_{tj}$ is the residual after 
accounting for the parabolic trend and the effect of the $j$-th locality; 
$a$, $b$ and $c$ are the coefficients of the regression curve, to be estimated 
from the data. As before, we shall assume that $\sum_j l_j = 0$. 

The constants are estimated by least squares. For example, the 
least-squares equation for $b$ is 

(21) $\sum [y_{tj}\,t' - 4(a t' + b t'^2 + c t'^3 + l_j t'_j)] = 0$, 

where the summation extends over all plots, $t' = t - \bar{t}$, and $t'_j$ implies 
the values of $t'$ used at the $j$-th locality. Since $\sum t' = \sum t'^3 = 0$ over 
all plots, equation (21) simplifies as follows: 

(22) $\sum y\,t' = 568b - (42l_1 + 30l_2 + 18l_3 + 6l_4 - 6l_5 - 18l_6 - 30l_7 - 42l_8)$. 

TABLE 7 

COEFFICIENTS OF LEAST SQUARES EQUATIONS FOR ESTIMATION OF OPTIMUM 
PLANTING DATE OF COTTON 

                                     Independent Variables 
Plot Totals*      l1   l2   l3   l4   l5   l6   l7   l8      a      b      c 

L1      29.11     12    0    0    0    0    0    0    0     12    -42    155 
L2      82.85      0   12    0    0    0    0    0    0     12    -30     83 
L3      63.61      0    0   12    0    0    0    0    0     12    -18     35 
L4      27.07      0    0    0   12    0    0    0    0     12     -6     11 
L5      50.55      0    0    0    0   12    0    0    0     12      6     11 
L6      46.00      0    0    0    0    0   12    0    0     12     18     35 
L7      62.65      0    0    0    0    0    0   12    0     12     30     83 
L8      45.76      0    0    0    0    0    0    0   12     12     42    155 
G      407.60*    12   12   12   12   12   12   12   12     96      0    568 
B      -39.22*   -42  -30  -18   -6    6   18   30   42      0    568      0 
C     2299.62*   155   83   35   11   11   35   83  155    568      0   6742 

Solution for b and c 

(1) $B + (7L_1 + 5L_2 + 3L_3 + L_4 - L_5 - 3L_6 - 5L_7 - 7L_8)/2 = -32.32 = 64b$ 

(2) $C - (155L_1 + 83L_2 + 35L_3 + 11L_4 + 11L_5 + 35L_6 + 83L_7 + 155L_8)/12 = -64.68 = 4096c/3$ 

(3) $G = 96a + 568c$ 

*G = Grand Total; $B = \sum y(t - \bar{t})$; $C = \sum y(t - \bar{t})^2$; Plot totals at Kawanda. 

The coefficients for all of the equations and the requisite data from 
the Kawanda experiment are given in Table 7. If we multiply the $L$ 
equations by the appropriate constants, as shown at the bottom of 
Table 7, and add to the $b$ and $c$ equations, we have at once the equations 
to estimate the values of $b$ and $c$ adjusted for locality effects, 

(23) $b = -0.5050$, $c = -0.04737$. 

And then $a = \bar{y} - 568c/96 = 4.5261$. 
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From these final equations, we also see that b and c are independent 
and that the variances of $b$ and $c$ are 

(24) $\sigma^2(b) = \sigma^2/64$; $\sigma^2(c) = 3\sigma^2/4096$, 

where $\sigma^2$ is estimated from the error variance, in this case 0.3606. Hence 
the estimated variances and standard errors of $b$ and $c$ are 

(25) $s^2(b) = 0.005634$; $s(b) = 0.07506$ 
     $s^2(c) = 0.0002641$; $s(c) = 0.01625$. 

These results indicate that both $b$ and $c$ are significantly different 
from 0. 
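
The elimination of the locality constants in Table 7 can be sketched in code. The following Python fragment is an illustration only, assuming the Kawanda date and locality totals of Table 4 and the error variance 0.3606. Because the design is symmetrical about $\bar{t}$, the cross terms between $b$ and $c$ (and between $b$ and $a$) vanish after elimination, which is why each coefficient can be solved from a single equation. Since the sums are recomputed here from the rounded totals, the last digits may differ slightly from the values printed in (23).

```python
import math

D = [9.72, 39.37, 59.47, 59.78, 43.92, 43.80, 54.93, 52.00, 32.42, 12.19]
L = [29.11, 82.85, 63.61, 27.07, 50.55, 46.00, 62.65, 45.76]
REPS = [1, 2, 3, 3, 3, 3, 3, 3, 2, 1]     # localities carrying each date
S2 = 0.3606                                # error variance (48 d.f.)

tp = [t - 5.5 for t in range(1, 11)]       # t' = t - t_bar

# Right-hand sides B = sum y t' and C = sum y t'^2 (y = totals of 4 plots).
B = sum(d * x for d, x in zip(D, tp))
C = sum(d * x * x for d, x in zip(D, tp))

# Coefficients of b and c in the locality equations: 4 * sum t', 4 * sum t'^2
# over the three dates at locality j (dates j, j+1, j+2).
lb = [4 * sum(tp[j:j + 3]) for j in range(8)]
lc = [4 * sum(x * x for x in tp[j:j + 3]) for j in range(8)]

s_bb = 4 * sum(r * x ** 2 for r, x in zip(REPS, tp))   # 568
s_cc = 4 * sum(r * x ** 4 for r, x in zip(REPS, tp))   # 6742

# Eliminate the l_j: subtract (coefficient / 12) times each L equation.
coef_b = s_bb - sum(p * p for p in lb) / 12
coef_c = s_cc - sum(q * q for q in lc) / 12
b = (B - sum(p * v for p, v in zip(lb, L)) / 12) / coef_b
c = (C - sum(q * v for q, v in zip(lc, L)) / 12) / coef_c
a = (sum(L) - s_bb * c) / 96               # from G = 96a + 568c

se_b = math.sqrt(S2 / coef_b)
se_c = math.sqrt(S2 / coef_c)

print(coef_b, round(coef_c, 3))
print(round(a, 4), round(b, 4), round(c, 5))
print(round(se_b, 5), round(se_c, 5))
```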

A comparison of the adjusted mean yields based on the quadratic 
regression equation (20) and on the original equation (7) (Table 5) is 
given below. 

                       ADJUSTED MEAN YIELD 

           Date        (7)        (20)      Deviation 

             1        5.691      5.839       -.148 
             2        5.869      5.713        .156 
             3        5.501      5.492        .009 
             4        5.247      5.177        .070 
(26)         5        4.372      4.767       -.395 
             6        4.413      4.262        .151 
             7        3.778      3.662        .116 
             8        2.989      2.968        .021 
             9        2.090      2.178       -.088 
            10        1.391      1.294        .097 

         Average*     4.247      4.247        .000 

*The Average is computed as $[d_1 + d_{10} + 2(d_2 + d_9) + 3(d_3 + \cdots + d_8)]/24$. 


From (26), we see that there is no pronounced trend in the deviations, 
such as consistent positive or negative deviations at the early and late 
dates or in the middle. The only large deviation is at the fifth date, for 
which we previously noted the unexplained sharp drop. The chief 
difference between the two series of adjusted yields is that the maximum 
point is at the first date using the quadratic regression as compared to 
the second date for the original adjusted yields. 

It might be advisable to check the adequacy of our quadratic 
prediction equation in estimating the adjusted yields. We find that the 
reduction in total sum of squares due to the quadratic regression is given by 

(27) (-32.32)(-0.5050) + (-64.6758)(-0.04737) = 19.3853. 

Hence the remaining sum of squares for the other 7 degrees of freedom 
for dates is 

(28) 21.7354 - 19.3853 = 2.3501. 

This mean square is 2.3501/7 = 0.3357, which is even less than the error 
mean square. 

The planting date to give the maximum yield can be estimated by 
differentiating the estimating quadratic equation (20) with respect to 
$t$ and equating the result to 0. This gives as a result the maximum 
planting date¹ 

(29) $t_{\max} = \bar{t} - b/2c = 0.17$. 

The variance of this estimate can only be approximated. If only the 
first order terms of the Taylor expansion of the differential of $b/2c$ are 
used, we find that 

(30) $\sigma^2(b/2c) = \frac{\sigma^2(b)}{4c^2} + \frac{b^2\sigma^2(c)}{4c^4} = \frac{1}{(2c)^2}\left[\frac{1}{64} + \frac{3b^2}{4096c^2}\right]\sigma^2$, 

where $\sigma^2$ is estimated by $s^2$. Hence 

(31) $s^2(b/2c) = 3.994$; $s(b/2c) = 2.00$. 

The standard error of $t_{\max}$ is the same as $s(b/2c)$, because $\bar{t}$ is fixed. 
Hence the maximum planting date ± two standard errors is given by 
0.17 ± 4.00. As expected, the confidence interval is very large, 
indicating that the difference between the two maximum points in (26) is 
unimportant. The variance of the estimate of the optimum planting is 
large for two reasons: (i) There is a serious loss of information in making 
the adjustments for locality differences. For example, the coefficient of 
$b$ is reduced from 568 to 64 in making the adjustments (Table 7). (ii) 
The optimum point comes before or near the first planting date used in 
the experiment. If the optimum planting date were near $\bar{t}$, so that $b$ 
would be small, $s^2(b/2c)$ would be materially reduced. 
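
The first-order approximation in (30) can be evaluated directly from the fitted coefficients. The Python sketch below is illustrative only; it takes $b$, $c$ and the error variance as printed in the text and uses the independence of $b$ and $c$ noted above. Working from these rounded inputs, the variance comes out a little below the printed 3.994, so the result should be read as approximate.

```python
import math

b, c = -0.5050, -0.04737     # fitted coefficients, equation (23)
S2 = 0.3606                  # error variance
T_BAR = 5.5                  # mean time period

var_b = S2 / 64              # equation (24)
var_c = 3 * S2 / 4096

# Turning point of the parabola; a maximum because c is negative.
t_max = T_BAR - b / (2 * c)

# First-order (Taylor) variance of b/2c for independent b and c.
var_ratio = var_b / (4 * c ** 2) + b ** 2 * var_c / (4 * c ** 4)
se_ratio = math.sqrt(var_ratio)

lower, upper = t_max - 2 * se_ratio, t_max + 2 * se_ratio
print(round(t_max, 2), round(var_ratio, 3), round(se_ratio, 2))
print(round(lower, 2), round(upper, 2))
```

The dominant term is $b^2\sigma^2(c)/4c^4$: with the optimum lying at the edge of the range, $b$ is large relative to $c$, which is the second of the two reasons given above for the wide interval.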

¹This is a maximum only if $c$ is negative. 

(d) Effects of Lygus Infestation. Unfortunately this design does not 
furnish any means of determining if there is any linear decrease in the 
yield for the last planting at a given locality because of a Lygus migration 
from the earlier planted plots. If we include a constant in equation (20) 
to represent the difference between the yields of the first and third 
plantings at all localities, this constant proves to be the same as the linear 
regression coefficient, $b$, of yield on planting date, adjusted for localities. 
As indicated before, the quadratic effect of the Lygus migration can be 
measured by taking the differences between the yield for the second 
planting and the average of the yields for the first and third plantings. 
This quadratic effect was shown to be non-significant. 


TABLE 8 

INDEX OF LYGUS DAMAGE PER COTTON PLANT FOR THE KAWANDA EXPERIMENT 

Locality              1                     2                     3                     4 
Date             1     2     3         2     3     4         3     4     5         4     5     6 

              2.16  1.16  2.89      4.02  2.00  1.54      3.66  2.17  1.46      1.78  2.13  2.47 
              2.49  1.69  2.22      3.20  3.35  1.65      3.64  1.91  1.56      3.59  3.04  2.27 
              1.71  1.60  2.89      3.69  4.11  1.09      3.84  1.52  1.56      2.17  2.87  2.76 
              1.62  1.33  1.67      3.00  4.55  2.50      3.73  2.11  1.53      2.26  2.31  3.20 

Total         7.98  5.78  9.67     13.91 14.01  6.78     14.87  7.71  6.11      9.80 10.35 10.70 
Locality total     23.43                 34.70                 28.69                 30.85 


Locality              5                     6                     7                     8 
Date             5     6     7         6     7     8         7     8     9         8     9    10 

              3.47  4.24  2.34      3.16  3.25  2.04      2.62  2.37  0.78      1.71  0.89  1.16 
              3.58  3.89  2.81      3.85  2.05  1.67      1.51  2.39  1.11      2.46  1.49  1.51 
              3.24  4.26   ...      3.78   ...  1.73      2.20  2.04  0.82      1.89  2.40  1.16 
              4.96  3.69  3.07      4.00  2.16  2.02      1.62  2.04  1.20      2.60  1.13  0.71 

Total        15.25 16.08  9.72     14.79 11.61  7.48      7.95  8.84  3.91      8.66  5.91  4.54 
Locality total     41.05                 33.76                 20.70                 19.11 

In order to check if there was an adverse effect on the cotton yield 
by Lygus migration, an index of the Lygus damage per plant was 
determined for each plot of the Kawanda experiment. The Lygus damage 
data are given in Table 8. Using these data, the cotton yields ($Y$) were 
adjusted for the Lygus damage ($X$) in the following analysis of 
covariance, where $x = X - \bar{X}$ and $y = Y - \bar{Y}$. 
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TABLE 9 


ANALYSIS OF COVARIANCE FOR THE KAWANDA EXPERIMENT 

                         Degrees of 
Source                    Freedom        Sy²         Sxy         Sx² 

Interaction                   7         1.0882      0.4357      3.0523 
Dates (adj.)                  9        21.7354     19.8782     36.7967 
Localities (adj.)             7       177.6513     29.7493     27.0614 
Replications                 24        11.6219     -2.5228      5.4713 
Error                        48        17.3125     -2.3957     14.0422 
Error + Dates (adj.)         57        39.0479     17.4825     50.8389 
Error + Loc. (adj.)          55       194.9638     27.3536     41.1036 

                         Degrees of     Adjusted       Mean 
Source                    Freedom         Sy²         Square 

Error                        47        16.9038       0.3597 
Error + Dates (adj.)         56        33.0360 
Dates (adj.)                  9        16.1322       1.7925** 
Error + Loc. (adj.)          54       176.7605 
Localities (adj.)             7       159.8567      22.8367** 

**Significant at the 1% probability level. 


From this analysis, we conclude that there was no significant over-all 
regression of yield on Lygus damage, as shown by the non-significant 
reduction in the error sum of squares by the use of X-variate (Lygus 
damage). Hence we conclude that there was no serious insect migration 
to the last-planted cotton at a given locality, when only three planting 
dates were used at each locality. 

The regression coefficient, $b_{Y \cdot X}$, is found from the error row in the 
above table: 

(32) $b_{Y \cdot X} = \frac{-2.3957}{14.0422} = -0.1706$. 

Each yield figure could be adjusted for the Lygus damage by use of the 
equation 

(33) $Y(\text{adjusted}) = Y + 0.1706(X - \bar{X})$. 

Then the optimum planting date could be determined from these 
adjusted yields by use of the methods given above. 
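
The covariance adjustment can be retraced from the sums of squares and products of Table 9. The Python sketch below is an illustration only; the helper name `adjusted_ss` and the dictionary layout are hypothetical, and the F ratios are computed for comparison with tabulated values and are not quoted from the text.

```python
# Sums of squares and products from Table 9: (d.f., Syy, Sxy, Sxx).
rows = {
    "error": (48, 17.3125, -2.3957, 14.0422),
    "dates_adj": (9, 21.7354, 19.8782, 36.7967),
}

def adjusted_ss(syy, sxy, sxx):
    """Sum of squares of Y after removing its linear regression on X."""
    return syy - sxy ** 2 / sxx

df_e, syy_e, sxy_e, sxx_e = rows["error"]
df_d, syy_d, sxy_d, sxx_d = rows["dates_adj"]

# Regression of yield on Lygus damage within the error line, equation (32).
b_yx = sxy_e / sxx_e

# Adjusted error: one degree of freedom is used by the regression.
err_adj = adjusted_ss(syy_e, sxy_e, sxx_e)
ms_err = err_adj / (df_e - 1)

# Test of the regression itself (1 and 47 d.f.).
f_regression = (syy_e - err_adj) / ms_err

# Dates adjusted for X: adjust (error + dates), then subtract adjusted error.
pooled_adj = adjusted_ss(syy_e + syy_d, sxy_e + sxy_d, sxx_e + sxx_d)
dates_adj = pooled_adj - err_adj
f_dates = (dates_adj / df_d) / ms_err

print(round(b_yx, 4), round(err_adj, 4), round(ms_err, 4))
print(round(pooled_adj, 4), round(dates_adj, 4))
print(round(f_regression, 2), round(f_dates, 2))
```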

It would not be useful in general to bother with such a small 
adjustment, but we will illustrate the method which would be used. In Table 
7, the locality yield totals (adjusted for Lygus damage) are 

(34) 28.15, 83.82, 63.55, 27.38, 52.60, 46.81, 61.23, 44.07. 

Similarly the adjusted values of $B$ and $C$ are -51.46 and 2261.44. Using 
these adjusted values, we obtain 

(35) $b = -0.5965$; $c = -0.05113$. 

The variances of these estimates will be the same as before except that $\sigma^2$ 
is estimated as the average variance of $Y - b_{Y \cdot X}(X - \bar{X})$.² This average 
variance is 0.3597(1 + 1/96) = 0.3634. There is no significant change 
in the value of either $b$ or $c$. 


ANALYSIS OF THE KABULA DATA 

As stated previously, the same experimental design was used at the 
Kabula Experiment Station, except that the planting dates ranged from 
September 13 through January 17. The average yield for this 
experiment was 4.463 kgms./plot. No data were gathered on the Lygus 
damage. The analysis of variance for the Kabula data is given in 
Table 10. 


TABLE 10 


ANALYSIS OF VARIANCE FOR THE KABULA COTTON YIELD DATA 

Source               Degrees of Freedom    Mean Square 

Interaction                   7               0.7999 
Dates (adj.)                  9               3.6573** 
Localities (adj.)             7              26.5159** 
Replications                 24               1.6730 
Error                        48               1.1448 

**Significant at the 1% significance level. 


The unadjusted mean yields per plot for each planting date at 
Kabula, the mean yields per plot adjusted for localities by equation (7) 
and the estimated adjusted mean yields per plot based on the quadratic 
regression equation (20) are presented in Table 11. 


²This is only an approximation to the best estimate of $\sigma^2$. The correct estimate would involve 
weighting each observation according to its use in the estimation of $b$ and $c$. 
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TABLE 11 


MEAN COTTON YIELDS FOR EACH PLANTING DATE AT KABULA 

                             Unadjusted      Mean Yields Adjusted for Localities 
Date           $t - \bar{t}$       Mean Yield    Equation (7)   Equation (20)   Equation (38) 

 1 (9-13)       -4.5          6.498          6.808          7.279           6.665 
 2 (9-27)       -3.5          6.412          6.997          6.479           7.151 
 3 (10-11)      -2.5          6.018          6.631          5.751           6.525 
 4 (10-25)      -1.5          4.816          5.661          5.096           5.428 
 5 (11-8)       -0.5          2.544          3.827          4.514           4.325 
 6 (11-22)       0.5          2.362          3.774          4.003           3.503 
 7 (12-6)        1.5          3.466          3.065          3.566           3.077 
 8 (12-20)       2.5          4.611          2.939          3.200           2.983 
 9 (1-3)         3.5          6.033          2.996          2.907           2.985 
10 (1-17)        4.5          4.286          2.644          2.686           2.666 

Average                       4.463          4.463          4.463           4.463 


Again we note that there is no significant decrease in the adjusted 
mean yield until we reach the fourth or fifth planting date. Some of the 
differences between the adjusted mean yields, using equation (7), and 
their standard errors are given below. 


          Date     Mean Difference     Standard Error
          2-4           1.336               .610
(36)      4-5           1.834               .498
          5-10          1.183               .976


It would appear that the optimum planting date is somewhere between 
September 13 and October 11. 
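
The comparisons in (36) can be checked against Table 11 with a few lines
of arithmetic. The sketch below recomputes each difference from the
equation (7) column and divides it by the quoted standard error. Reading
the ratio against a t distribution with the 48 error degrees of freedom
of Table 10 (two-sided 5% point roughly 2.0) is an assumption of this
illustration, not a step stated in the text.

```python
# Adjusted mean yields (equation (7) column of Table 11) for the dates compared.
adjusted = {2: 6.997, 4: 5.661, 5: 3.827, 10: 2.644}

# Standard errors quoted in (36) for each pair of dates.
std_err = {(2, 4): 0.610, (4, 5): 0.498, (5, 10): 0.976}

ratios = {}
for (early, late), se in std_err.items():
    diff = adjusted[early] - adjusted[late]
    ratios[(early, late)] = diff / se
    print(f"dates {early}-{late}: difference {diff:.3f}, ratio to s.e. {diff / se:.2f}")
```

On this reading the drop from date 4 to date 5 is by far the most
clearly established of the three differences.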

When the quadratic trend was fitted to the mean yields at the suc-
cessive dates by equation (20), we obtained the following estimates of
b and c ± their standard errors:

(37)    $b = -0.5103 \pm 0.1338$
        $c = 0.0362 \pm 0.0290$

We note that only the linear coefficient, b, is significantly different
from 0. Also, since c is positive (b and c being of opposite signs), this
regression curve will be concave upwards, indicating a minimum rather
than a maximum at the point $t = \bar{t} - b/2c$. Hence no optimum
planting date can be estimated by use of the quadratic equation.
Extrapolation would be impossible because the adjusted means estimated
by the regression will increase on both tails when c is positive. The
minimum point is not reached until t = 12.55; hence, Table 11 does not
indicate the increasing yields at the later dates. However, we do see
that the regression estimate is beginning to diverge quite noticeably at
the first date.
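
The location of the turning point follows directly from the estimates in
(37). A minimal sketch, assuming that the centered time scale t' is the
date number minus its mean of 5.5 (which is what the t' values -4.5 to
4.5 for dates 1 to 10 imply):

```python
# Quadratic trend in centered time t' = t - t_bar.
b, c = -0.5103, 0.0362   # estimates quoted in (37)
t_bar = 5.5              # assumed: mean of the date numbers 1..10

t_prime_turn = -b / (2 * c)      # stationary point on the centered scale
t_turn = t_bar + t_prime_turn    # same point on the date-number scale
kind = "minimum" if c > 0 else "maximum"

print(f"{kind} at t' = {t_prime_turn:.2f}, i.e. t = {t_turn:.2f}")
```

The turning point lies beyond the tenth date, which is why the fitted
values in Table 11 are still falling at the last planting.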

It might be useful to investigate the possibility that equation (20)
should be changed to include third and fourth degree terms, as follows:

(38)    $y_{ij} = 4[a + bt' + ct'^2 + dt'^3 + et'^4] + p_{ij}.$

The degree must be even if we are to have the downward trend at both
ends. The estimates of b and d (adjusted for localities) are independent
of c and e. However, b and d are correlated with each other and so are
c and e.

The estimating equations for b and d are:

(39)    $64b + 1072d = -32.658$
        $1072b + 31060d = -299.868$

The inverse of the coefficient matrix is

(40)    $\begin{bmatrix} .03703545 & -.00127824 \\ -.00127824 & .0000763126 \end{bmatrix}$

The estimated values of b and d ± their standard errors are:

(41)    $b = -0.82622 \pm 0.2059$
        $d = 0.018862 \pm 0.00935.$

Hence we conclude that both b and d are significantly different from 0.
Similarly for c and e, the estimating equations are:

(42)    $(4{,}096c + 84{,}736e)/3 = 49.441$
        $(84{,}736c + 2{,}038{,}528e)/3 = 326.086.$

The inverse of the (c, e) coefficient matrix is:

(43)    $\begin{bmatrix} .00522869 & -.000217342 \\ -.000217342 & .0000105060 \end{bmatrix}$

Hence the estimated values of c and e with their standard errors are:

(44)    $c = .18764 \pm .07739$
        $e = -.0073197 \pm .003467$

The adjusted mean yields estimated by the quartic regression curve
(38) are also given in Table 11. The agreement between these estimated
mean yields and those estimated from equation (7) is quite good. The
estimated mean yields (by quartic regression) are beginning to drop off
at both ends, and there are no pronounced series of plus or minus devia-
tions as with the estimated mean yields found by using only the quad-
ratic regression. The remaining date sum of squares after fitting equa-
tion (38) is 4.6986 with 5 degrees of freedom, giving a mean square of
0.9397, which is smaller than the error mean square.
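
The estimates in (41) and (44) can be reproduced from the printed
inverse matrices and right-hand sides. The sketch below multiplies each
inverse into its right-hand side and takes the standard errors as the
square root of the diagonal element times the error mean square of
Table 10. Using 1.1448 as the variance estimate is an assumption of this
sketch, although it does reproduce the quoted standard errors.

```python
import math

ERROR_MS = 1.1448  # error mean square from Table 10 (assumed variance estimate)

def solve_with_inverse(inverse, rhs):
    """Multiply a 2x2 inverse matrix into the right-hand side vector."""
    return [sum(inverse[r][k] * rhs[k] for k in range(2)) for r in range(2)]

def standard_errors(inverse):
    """Standard errors from the diagonal of the inverse matrix."""
    return [math.sqrt(inverse[i][i] * ERROR_MS) for i in range(2)]

inv_bd = [[0.03703545, -0.00127824], [-0.00127824, 0.0000763126]]    # (40)
inv_ce = [[0.00522869, -0.000217342], [-0.000217342, 0.0000105060]]  # (43)

b, d = solve_with_inverse(inv_bd, [-32.658, -299.868])  # right-hand sides of (39)
c, e = solve_with_inverse(inv_ce, [49.441, 326.086])    # right-hand sides of (42)
se_b, se_d = standard_errors(inv_bd)
se_c, se_e = standard_errors(inv_ce)

for name, est, se in (("b", b, se_b), ("d", d, se_d), ("c", c, se_c), ("e", e, se_e)):
    print(f"{name} = {est:.5f} +/- {se:.5f}")
```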

The problem of estimating the optimum planting date is complicated 
by the fact that there are two maximum points and one minimum point. 
These points are solutions of the equation 

(45)    $b + 2ct' + 3dt'^2 + 4et'^3 = 0.$

The maximum point with the largest average adjusted yield is at
t' = -3.665, which is slightly before the second date (September 27). No
attempt has been made to estimate the standard error of this estimated
optimum planting date.
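
The stationary points of the quartic can be located numerically from
(45). The following is a rough sketch using the rounded coefficients of
(41) and (44): it scans the slope for sign changes on an arbitrarily
chosen interval and refines each one by bisection. The intercept is
omitted when comparing the two maxima because it does not affect which
is higher.

```python
b, c, d, e = -0.82622, 0.18764, 0.018862, -0.0073197  # from (41) and (44)

def slope(t):
    """Left-hand side of (45): derivative of the quartic in t'."""
    return b + 2 * c * t + 3 * d * t**2 + 4 * e * t**3

def curve(t):
    """Quartic without its intercept (enough to rank the maxima)."""
    return b * t + c * t**2 + d * t**3 + e * t**4

def bisect(f, lo, hi, tol=1e-9):
    """Refine a root of f bracketed by a sign change on [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

STEP = 0.01
grid = [-6 + STEP * i for i in range(1201)]  # scan t' from -6 to 6 (arbitrary range)
maxima, minima = [], []
for lo, hi in zip(grid, grid[1:]):
    s_lo, s_hi = slope(lo), slope(hi)
    if s_lo > 0 >= s_hi:        # slope changes from + to -: a maximum
        maxima.append(bisect(slope, lo, hi))
    elif s_lo < 0 <= s_hi:      # slope changes from - to +: a minimum
        minima.append(bisect(slope, lo, hi))

best = max(maxima, key=curve)
print("maxima:", [round(t, 3) for t in maxima])
print("minima:", [round(t, 3) for t in minima])
print("highest maximum at t' =", round(best, 3))
```

With these coefficients the second maximum and the minimum both fall
between dates 8 and 9 and lie close together, so their exact positions
are sensitive to rounding in the estimates.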

RELATIVE EFFICIENCY OF THIS EXPERIMENTAL DESIGN 

If there were no adverse effects from the Lygus migration to the
newly planted cotton, it would be advisable to use as many planting dates
as possible at each locality. The analysis of the Kawanda data seems
to indicate that more than 3 successive planting dates might have been
used at a given locality without incurring any serious insect interference.
We shall consider the relative efficiency of the following field designs:

(i) The present “staggered” 3-date plan. 

(ii) A staggered 4-date plan, with only 6 locations as follows:

        Location     Dates          Location     Dates
           1         1,2,3,4           4         5,6,7,8
           2         2,3,4,5           5         6,7,8,9
           3         3,4,5,6           6         7,8,9,10

(Notice that the sequence (4, 5, 6, 7) is omitted in order to
have a design which would have the same number of repli-
cations per date as for the 3-date plan.)

(iii) A balanced incomplete blocks design with 3 planting
dates per block, which requires 30 blocks with each plant-
ing date being replicated 9 times. (The same number of
plots could not be planted at each location if this design
were used.)

(iv) A complete blocks design with all 10 planting dates at each 
locality, the locality being a complete block. 

Under designs (i) and (ii), 96 plots would be used, but only 90 plots 
would be used for (iii) and only 80 plots for (iv). Hence the expected 
error variances will have to be adjusted to an equal number of plots. 
The most efficient of these designs would be the complete blocks design 
(iv), because no adjustments for localities would be required in determin¬ 
ing the average yield for a given planting date. The balanced incomplete 
design (iii) has been shown to be 74% as efficient as the complete blocks 
design (iv); that is, 4 replications of each planting date using (iii) would 
be required to give the same accuracy as 3 replications using (iv). 
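
The block counts and the 74% figure are consistent with the usual
counting identities and efficiency factor for a balanced incomplete
blocks design. The sketch below is an illustration under one assumption
not stated in the text: that each pair of planting dates occurs together
in two blocks (lambda = 2). It also covers the 4-dates-per-block case.

```python
from fractions import Fraction

def bib_parameters(v, k, lam):
    """Replications r, number of blocks, and efficiency factor of a
    balanced incomplete blocks design with v treatments in blocks of k,
    each pair of treatments occurring together in lam blocks."""
    r = Fraction(lam * (v - 1), k - 1)            # from lam*(v-1) = r*(k-1)
    blocks = r * v / k                            # from blocks*k = r*v
    efficiency = Fraction(v * (k - 1), k * (v - 1))
    return r, blocks, efficiency

for k in (3, 4):
    r, blocks, eff = bib_parameters(v=10, k=k, lam=2)  # lam = 2 is assumed
    print(f"k = {k}: r = {r}, blocks = {blocks}, efficiency = {float(eff):.1%}")
```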

The relative efficiency of design (i) will also be assessed with respect
to (iv) by comparing the average variances of the difference between
the mean yields at successive planting dates. If the error variance per
plot is designated as $\sigma^2$, then the variance of the difference
between any two date means for the complete blocks design would be
$2\sigma^2/8 = .25\sigma^2$. This variance can be compared with any of
those in Table 6 by adjusting for the different number of total plots
used in the two designs. Since 96 plots were used in the “staggered”
3-date plan (i), the variances in Table 6 should be multiplied by
96/80 = 1.2 to put them on the same basis as that of the complete
blocks design.

The lowest variance in Table 6 is that for the difference in yield
between dates 5 and 6, $0.21652\sigma^2$. For comparison purposes, we
multiply this variance by 1.2, giving $.26\sigma^2$. Hence this
comparison is 96% efficient as compared with the same comparison if
design (iv) were used. If we omit the comparisons for the first two and
the last two planting dates, the average of the efficiencies for
planting dates separated by one time interval (a fortnight) would be
94%. The comparisons for the first two and the last two dates are quite
inefficient, because these planting dates are not used at many locations
under design (i); the relative efficiency is only 48%. The average
efficiency for all nine of the one fortnight comparisons is 84%. The
average efficiencies for all the comparisons of average yields given in
Table 6 are presented in Table 12 below. It should be noted that these
efficiencies are based on the premise that $\sigma^2$ is the same for
all designs. If the increase of block size to 10 plots per block in
design (iv) results in an increase in $\sigma^2$, the relative
efficiencies would be higher than those given in Table 12. The
incomplete blocks design (iii) would have the same block size; hence,
$\sigma^2$ for this design would be expected to be the same as for the
3-date plan (i). The relative efficiency of (i) as compared to (iii)
would be 1/.74 = 1.35 times as great as the efficiencies given in
Table 12.
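
The arithmetic behind the 96% and 84% figures can be laid out as a short
calculation. This sketch uses only numbers quoted in the text; the
variable names are illustrative.

```python
# Variance of a difference of two date means under the complete blocks
# design (iv), in units of sigma^2: 8 localities give 8 plots per date.
var_complete = 2 / 8

# Smallest Table 6 variance (dates 5 and 6) for the staggered 3-date plan,
# rescaled from 96 plots to the 80 plots of design (iv).
plot_adjustment = 96 / 80
var_staggered = 0.21652 * plot_adjustment

efficiency_5_6 = var_complete / var_staggered

# Nine one-fortnight comparisons: seven interior ones averaging 94% and
# the two end comparisons at 48%.
average_one_fortnight = (7 * 94 + 2 * 48) / 9

print(f"dates 5-6 efficiency: {efficiency_5_6:.0%}")
print(f"average over all nine comparisons: {average_one_fortnight:.0f}%")
```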

As stated above, we also consider the efficiency of the 4-date plan (ii),
assuming that $\sigma^2$ would be the same for this design as for (i),
even though only 6 instead of 8 localities are considered. The least
square equation for (ii) was set up and a variance table such as Table 6
was then constructed. The relative efficiencies of this design (ii)
compared with the complete blocks design (iv) are also presented in
Table 12. It might be mentioned that a balanced incomplete blocks design
with 4 planting dates per block can be constructed, using 15 blocks.
This balanced incomplete blocks design is 83% as efficient as the
complete blocks design, assuming $\sigma^2$ does not increase with the
increase of block size for the complete blocks design.


TABLE 12

AVERAGE PERCENTAGE EFFICIENCIES OF THE 3 AND 4-DATE PLANS (i and ii)
COMPARED WITH A COMPLETE BLOCKS DESIGN (iv)

               Number of Fortnights Separating Planting Dates
Design      1     2     3     4     5     6     7     8     9    Avg.
 i*        94    65    46    36    30    25    22                 57
 ii*      106    89    68    52    44    41    34                 74
 i†        84    61    43    34    28    23          17           47
 ii†       93    80    63    48    41    35          26           60

*Omitting the first and last planting dates.
†Using all planting dates.


From Table 12, we see that the 3-date plan (design i) is only about
3/4 as efficient as the 4-date plan (design ii). That is, we could have
obtained about the same variances of the adjusted mean differences
with 4 replications at each of the 6 locations for (ii) as with the 4
replications at each of the 8 locations for (i).
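
One reading that reproduces the Avg. column of Table 12 for the rows
omitting the first and last dates is an average weighted by the number
of date pairs at each separation (seven pairs one fortnight apart, six
pairs two fortnights apart, and so on among dates 2 to 9). This
weighting is an inference from the printed figures, not something the
text states.

```python
def pair_weighted_mean(efficiencies, n_dates):
    """Average efficiency over all date pairs; efficiencies[s - 1] applies
    to pairs s fortnights apart, and there are n_dates - s such pairs."""
    weights = [n_dates - s for s in range(1, len(efficiencies) + 1)]
    return sum(w * x for w, x in zip(weights, efficiencies)) / sum(weights)

row_i = [94, 65, 46, 36, 30, 25, 22]     # 3-date plan, dates 2 to 9
row_ii = [106, 89, 68, 52, 44, 41, 34]   # 4-date plan, dates 2 to 9

mean_i = pair_weighted_mean(row_i, n_dates=8)
mean_ii = pair_weighted_mean(row_ii, n_dates=8)

print(f"plan (i): {mean_i:.1f}   plan (ii): {mean_ii:.1f}   ratio: {mean_i / mean_ii:.2f}")
```

The ratio of the two averages is close to the "about 3/4" quoted above.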

It might be noted that in this particular experiment, the most im-
portant comparison for the determination of the range of the optimum
planting dates was that between dates 1 and 5. The efficiency of the
estimate of the difference in the mean yield between these two dates is
only 30% for the 3-date plan (i). Actually, since the optimum planting
date could have come before the first planting date used in the experi-
ments, this range should have been about twice as large; the efficiency
of this estimate would be even lower than 30%. Hence we can conclude
that the 3-date design is a very inefficient design for determining the
optimum planting date.

194 BIOMETRICS, SEPTEMBER 1948

The efficiency of the determination of the optimum planting date can 
also be estimated from the variances of b and c in the quadratic regression 
equation (20). The variances of b and c for the 3-date plan (using 96 
plots) are given in equation (24). 

The variances of b and c for the four designs (i, ii, iii and iv), based on
80 plots, are the following coefficients of $\sigma^2$:

                           (i)         (ii)        (iii)       (iv)
(46)    $\sigma^2(b)$     .01875      .01000      .00214      .00152
        $\sigma^2(c)$     .000882     .000514     .000320     .000237

These results show that the 3-date plan (i) is only about 8% efficient
in the estimation of b and 27% efficient in the estimation of c as compared
to the complete blocks design, if we can assume that $\sigma^2$ will not
increase with the latter. When compared with the incomplete blocks
design, the efficiency is about 1.6 times as great. Comparing the 3 and
4-date plans, we note that the efficiency is almost doubled for the
4-date plan.
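
The percentages quoted for b and c are ratios of the coefficients in
(46). A small sketch of that calculation, with the efficiency of a
design taken as the variance under the reference design divided by the
variance under the design in question:

```python
var_b = {"i": 0.01875, "ii": 0.01000, "iii": 0.00214, "iv": 0.00152}
var_c = {"i": 0.000882, "ii": 0.000514, "iii": 0.000320, "iv": 0.000237}

def efficiency(variances, design, reference="iv"):
    """Efficiency of `design` relative to `reference` (smaller variance is better)."""
    return variances[reference] / variances[design]

eff_b_i = efficiency(var_b, "i")                     # 3-date plan vs complete blocks
eff_c_i = efficiency(var_c, "i")
gain_b_ii = efficiency(var_b, "i", reference="ii")   # 4-date plan relative to 3-date
gain_b_ii = 1 / gain_b_ii

print(f"plan (i) vs (iv): b {eff_b_i:.0%}, c {eff_c_i:.0%}")
print(f"plan (ii) relative to plan (i), for b: {gain_b_ii:.2f} times")
```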

From equation (30), we see that the variance of the optimum planting
date (if the regression is quadratic) is

(47)    $\sigma^2(\hat{t}) = \frac{1}{4c^2}\left[\sigma^2(b) + \frac{b^2}{c^2}\,\sigma^2(c)\right],$

where $\sigma^2(b)$ and $\sigma^2(c)$ are found by multiplying the
constants given in (46) by $\sigma^2$. In order to increase the
efficiency of the experiment, we must either reduce $\sigma^2$, redesign
the experiment so as to obtain lower coefficients for $\sigma^2(b)$ and
$\sigma^2(c)$, or plan the experiment so as to have the optimum planting
date fall near the middle of the dates used in the experiment (make b
small).

If we do set up the experiment so that b is expected to be small, then
the efficiency of the experiment largely depends on $\sigma^2(b)$. From
(46), we see that $\sigma^2(b)$ is very large for the “staggered”
designs as compared to the randomized blocks designs (iii and iv). It
has been suggested that we might increase the efficiency of the
“staggered” 3-date design by redesigning the experiment. For example we
might use only the following sequences of dates (1, 2, 3; 2, 3, 4;
7, 8, 9; 8, 9, 10), planting each at two locations. Although
$\sigma^2(c)$ is reduced from its value of $3\sigma^2/4096$,
$\sigma^2(b)$ remains at $\sigma^2/64$ (for 96 total plots). This design
is not recommended for two reasons: (i) It does not improve the
efficiency of the estimation of b, which is the chief contribution to
the inefficiency of the “staggered” designs. (ii) There should be some
estimate of the yield for all planting dates, especially since the
hypothesis of a quadratic regression of yield on planting date may be
false. If some other regression equation is used, the reduction in the
variance of c may be offset by an increase in the variance of some other
estimate.

If we were to use the sequences 1, 2, 3; 3, 4, 5; 5, 6, 7; 7, 8, 9, all
planting dates, except the tenth, would be represented. However,
$\sigma^2(b)$ would still be $\sigma^2/64$ and $\sigma^2(c)$ would be
increased to $3\sigma^2/3904$.
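
The coefficients $\sigma^2/64$, $3\sigma^2/4096$ and $3\sigma^2/3904$
can be reproduced from within-locality sums of squares and products of
t' and t'^2. The sketch below makes several assumptions that are not
spelled out in this section: a quadratic in t' plus a separate constant
for every locality, 4 replications of each sequence, equal error
variance per plot, and t' measured from the mean date of the design in
question. The function and variable names are illustrative only.

```python
from fractions import Fraction

def variance_coefficients(sequences, reps=4):
    """Coefficients of sigma^2 in var(b) and var(c) for a quadratic in t'
    fitted together with a constant for each locality (one sequence of
    planting dates per locality, each replicated `reps` times)."""
    dates = [d for seq in sequences for d in seq]
    centre = Fraction(sum(dates), len(dates))   # assumed origin of t'
    s_bb = s_cc = s_bc = Fraction(0)
    for seq in sequences:
        t = [Fraction(d) - centre for d in seq]
        t2 = [x * x for x in t]
        t_mean = sum(t) / len(t)
        t2_mean = sum(t2) / len(t2)
        for x, x2 in zip(t, t2):
            # Deviations from the locality mean remove the locality effect.
            s_bb += reps * (x - t_mean) ** 2
            s_cc += reps * (x2 - t2_mean) ** 2
            s_bc += reps * (x - t_mean) * (x2 - t2_mean)
    det = s_bb * s_cc - s_bc ** 2
    return s_cc / det, s_bb / det   # var(b), var(c) as multiples of sigma^2

present = [(i, i + 1, i + 2) for i in range(1, 9)]             # the 3-date plan
ends_only = [(1, 2, 3), (2, 3, 4), (7, 8, 9), (8, 9, 10)] * 2  # first alternative
alternating = [(1, 2, 3), (3, 4, 5), (5, 6, 7), (7, 8, 9)] * 2 # second alternative

for name, design in (("present", present), ("ends only", ends_only),
                     ("alternating", alternating)):
    var_b, var_c = variance_coefficients(design)
    print(f"{name}: var(b) = {var_b}, var(c) = {var_c}")
```

Under these assumptions the first alternative gives var(c) = 3/7168 of
$\sigma^2$, a computed value that does not appear in the text as
printed here.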


SUMMARY AND CONCLUSION 

This paper presents a new type of experimental design, the “stag¬ 
gered” design, for use with experimental material which can have but 
few consecutive plantings at a given locality. Two experiments involv¬ 
ing the determination of the optimum planting date of cotton have been 
conducted in British East Africa. The “staggered” design was used 
here because of a fear that an insect, Lygus simonyi, would tend to trans¬ 
fer from earlier planted plots to newly planted ones, hence distorting a 
proper assessment of the relationship between cotton yield and planting 
date. At one of the experiments, an index of the Lygus damage was 
determined for each plot. If this index truly reflected the Lygus damage, 
it appeared that there was no important migration for planting dates 
separated by four weeks or less. 

We have shown that the 3-date “staggered” design is decidedly
inefficient in estimating the optimum planting date. The adjustments
for locality effects are so great that there tends to be a very long range
of indeterminacy of the optimum planting date. The following sugges-
tions are offered to improve the efficiency of the estimation of the
optimum planting date under conditions of uncertainty with respect to
insect migration:

(i) Test the adequacy of the Lygus damage index as a true indicator
of the infestation by the Lygus bug. If this index is reliable, or can be
made reliable, then it appears that more planting dates should be used
at each locality and that adjustments should be made for the Lygus
damage by covariance techniques rather than by use of the “staggered”
design.

(ii) Even if the index is shown to be inadequate, especially when more
than 3 planting dates are used at each locality, we would advise at least
trying 4 successive planting dates at each locality.

(iii) A great improvement would result if the localities did not differ 
so widely in their fertility. This “staggered” design would be much 
more efficient if the locality differences were not so pronounced as in 
these African experiments. 

(iv) It is advisable to plan the experiment so that the optimum 
planting date is near the middle date used in the experiment. 

(v) Ordinarily the experimenter hopes to secure some information on 
the correct allocation of experimental material as to changes in the num¬ 
ber of locations and replications within locations. In this case the small 
date-location interaction as compared to the date-replication interaction 
leads us to infer that one could not lose any information by using fewer 
localities and more replications at each locality. Since this result was 
obtained in both experiments, we feel that the use of more planting dates 
at a given locality with fewer localities being used probably would not 
materially alter the error variance. 



ON THE USE OF PARALLEL OR NONPARALLEL 
SYSTEMS OF TRANSFORMED CURVES IN BIOASSAY: 
ILLUSTRATION IN THE QUANTITATIVE 
COMPLEMENT-FIXATION TEST* 

William R. Thompson 

Division of Laboratories and Research,

New York State Department of Health, Albany

A method of moving-average interpolation has been presented
elsewhere [1,2a] to provide a sound basis for quantal bioassay
independent of any assumption as to the precise form of the dosage-
response curve. As the method is relatively simple in application as
well as in assumptions involved, it may be regarded as a basic method
in the sense [1] that it may be preferred except in any given situation
where the use of some other method can be justified by improved pre-
cision or by permitted technical economies. Moreover, it was shown
that the Kärber [3-6] and Reed-Muench [7] methods (designed to
serve a like purpose) have serious defects, and that modifications intro-
duced to eliminate these defects led directly to the moving-average
interpolation method. The characteristics of various prevalent curve-
fitting methods were also discussed with emphasis upon certain features 
usually ignored, especially with regard to systems of curve fitting by 
minimizing the sum of weighted squared deviations and use of adjusted 
or fictitious points. However, the writer was careful to emphasize that 
he has no intention of discouraging use of the curve-fitting methods, 
which appear to have found many valuable applications. 

Now with regard to the systems of curves that may be used to
approximate either the dosage-response curve or what has been desig-
nated previously [1] the fundamental (log dose, response) curve; it is often
considered essential to a valid system of bioassay that the log dose-
response curves employed should all have the same shape. This is
equivalent, of course, to providing that the curves be transformable to
a system of parallel straight lines with log dose as abscissa and some
univariant function of response (for example, the probit or the logit) as
ordinate. The present purpose is not only to examine conditions under
which such a stipulation may be justified, but in striking contrast to
suggest other possible conditions where such a stipulation would be
invalid yet where valid systems for bioassay by curve fitting may exist,
and to provide illustrations of each of the two situations drawn from
investigations of the quantitative complement-fixation reactions. Inci-
dent to this development certain relations, familiar in physical chemistry,
are used to portray a part of the nature of these biochemical reactions
and to suggest possible features of reactions in vivo.

*Presented at the Annual Meeting of the American Statistical Association, Biometrics Section,
New York, N. Y., December 29, 1947.

Accordingly, suppose that we are dealing with a quantal (all or none
response) assay system. In a given case let D be the dose of an agent C
allowed to act (possibly in some cases modified by introduction also of an
amount V of an inhibitor F) on a given number n of subjects (indi-
viduals treated), and let r be the number of these that respond critically,
and p = r/n. Then, under given conditions with V fixed (possibly
zero), assume that we have a continuous univariant function of x,

(1)    $y = f(x) = f^*(x, a, b, \ldots)$

where $0 \le y \le 1$ for x > 0, which may be used in conjunction with some
prescribed fitting process applied to certain parameters (a, b, etc.) so
that the (x, y) curve will satisfactorily approximate the set of observed
points (D, p) obtained in a bioassay. Let K be the estimate of median-
effective dose defined by f(K) = 0.5. In order to simplify the discussion
without loss of essential generality, assume that the critical response
is so defined that f(x) is an increasing continuous function of x. Also,
let L denote log x, and let $\phi(L)$ be identical with f(antilog L) = f(x).
Then $y = \phi(L)$ describes the approximate fundamental curve with y as
ordinate and L as abscissa.

Now, let X(x) and Y(y) be increasing continuous functions respec-
tively of x and y. Further assume that these functions may be defined
so that

(2)    $X(x) = a + b \cdot Y(y)$

where a and b are parametric constants (obviously b > 0). Further
restrict the Y-function by the identity,

(3)    $Y(0.5) = 0,$

which may be done without loss in essential generality. Accordingly,
by the definition of K (K = x if y = 0.5), we have

(4)    $X(K) = a.$

In practice we may wish to relax these restrictions on Y(y) which,
however, facilitate the present discussion. Thus replacing Y(y) in (2)
by $Y_0(y) = \pm(Y(y) + A)$, and replacing a and b respectively by $a_0$
and $b_0$ and applying the same curve-fitting process should result in the
evaluations of these parameters as $b_0 = \pm b$ and $a_0 = a - b_0 \cdot A$
with the resultant value for K identically the same,
$X(K) = a = a_0 + b_0 \cdot A$. An example would be use of
$Y_0(y)$ = probit y which is equivalent to use of Y(y) = the normal
equivalent deviation of y (for unit standard deviation), which may be
denoted by N.E.D. of y; in this case $b = b_0$ and A = 5. Another
example would be $Y_0(y) = \log[(1 - y)/y]$ instead of
$Y(y) = \log[y/(1 - y)]$; then $b_0 = -b$ and $a_0 = a$ as A = 0.

In a personal communication Berkson has informed the writer that
he plans to change his definition of logit y to equal log [y/(1 - y)] rather
than minus that value.

Obviously, in any case relation (2) has the advantage of permitting
a fit of a straight line to experimental points [X(D), Y(p)] graphically
by inspection or by some least square technic with or without a weighting
system or by some other specified curve-fitting process such as those
mentioned above. Of course, relation (2) is equivalent to

(5)    $Y(y) = a' + b' \cdot X(x)$  for  $b' = 1/b$  and  $a' = -ab'$.

Use of the form (2) or (5) is not intended here to indicate any prejudice
as to the direction in which any deviations are to be measured nor as to
the curve-fitting technic to be used. It seems preferable to suspend
judgment on such issues temporarily, since the principal object of the
present discussion is to indicate what kind of systems of curves may be
considered appropriate to given situations, especially with regard to
whether or not in (2) we are to use a family of parallel straight lines to
fit two or more sets of assay data; i.e., to make a simultaneous fit with
the same value of b but individual values of a for each set, or possibly
with a prescribed value of b. Moreover, as shown previously [1], the
weighting systems commonly used for cases where n is relatively small
(say 20 or less) take into account only the supposed principal source of
error, biologic variation in individual resistance to the agent in question
and sampling error corresponding to use of n individuals in a given case
under test; the systems may be inappropriate for cases where n is large,
for example, in the quantitative complement-fixation tests where n is
about 100 million or more.

The pioneer work in application of transformations as in (2) to data
as obtained in bioassay appears to have been that of Von Krogh [8], who
employed

(6)    $X(x) = \log x$  and

(7)    $Y(y) = \log \frac{y}{1 - y}$

to convert observed points to approximately linear form, where y is the
degree of hemolysis induced in sensitized red blood cells by a given
amount x of complement under given conditions. This was used by
Wadsworth, Maltaner, and Maltaner [9] and others [10-18, 2] in
quantitative studies of complement fixation and in complement titra-
tion, which furnished the basis for the quantitative tests for specific
antibody content in blood or other body fluids. Thus, the logistic
function converted to logarithmic form

(8)    $\log x = \log K + h \cdot \log \frac{y}{1 - y}$

was employed¹, where K is the estimated amount of complement required
under otherwise the same conditions to induce 50-per-cent hemolysis
(the median-effective dose) and h is a parametric constant. Under these
conditions in (2) we should have b = h and a = log K.
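
As an illustration of how (8) is used, the sketch below fits a straight
line to the transformed points and reads K and h from its intercept and
slope. The data are synthetic and generated without error from assumed
values of K and h, so they carry no empirical meaning. The function name
and the choice of unweighted least squares of log x on the transformed
response are assumptions made for this example only; the text
deliberately leaves the fitting technic open.

```python
import math

def fit_von_krogh(doses, responses):
    """Unweighted least-squares line X = a + h*Y with X = log10 x and
    Y = log10[y / (1 - y)], as in (6)-(8); returns (K, h) with K = 10**a."""
    X = [math.log10(x) for x in doses]
    Y = [math.log10(y / (1 - y)) for y in responses]
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    s_yy = sum((yy - y_bar) ** 2 for yy in Y)
    s_xy = sum((xx - x_bar) * (yy - y_bar) for xx, yy in zip(X, Y))
    h = s_xy / s_yy
    a = x_bar - h * y_bar
    return 10 ** a, h

# Synthetic, error-free illustration with assumed parameter values.
K_true, h_true = 2.0, 0.2
responses = [0.1, 0.3, 0.5, 0.7, 0.9]
doses = [K_true * (y / (1 - y)) ** h_true for y in responses]

K_hat, h_hat = fit_von_krogh(doses, responses)
print(f"K = {K_hat:.4f}, h = {h_hat:.4f}")
```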

The logistic curve has been applied thus to other forms of bioassay 
by Wilson and Worcester [20,21] and by Berkson [22,23], with different 
curve-fitting methods. The former have also considered [24] general 
relations permitting use of other curve forms. They stress the impor¬ 
tance of recognizing that no one curve form should be assumed suitable 
for use in all situations, which may involve different laws of biologic 
reaction of various animals to various biologicals. Some of the possible 
types of difficulty that may be encountered have been discussed else¬ 
where [1,2a]. 

The principal rival of the logistic as approximation to the fundamental
(log dosage-response) curve is the integrated normal curve (taking probit
y for $Y_0(y)$ = 5 + the N.E.D.). Winsor [25] has shown that either
may be fitted to the other so well over the ranges usually employed in
bioassay that it would ordinarily be difficult to discriminate between
them on the basis of goodness of fit to experimental data and usefulness
in estimation of median-effective dose. Accordingly, it would seem naive
to use an experimentally observed correlation between two methods,
one using the logistic and the other the integrated normal curve, as an
argument in favor of either method in a given situation. Likewise it
should be expected that the integrated normal curve could be used in-
stead of the logistic to represent the hemolysis data, and this has been
done by Ipsen [26]. Bliss and Cattell [27a] in an extensive review cite
Ipsen’s work, but apparently 1914 is incorrectly given as its publication
date in place of 1941.
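
How closely the two curves can agree is easy to see numerically. The
sketch below compares the integrated normal curve with a logistic curve
whose scale has been multiplied by 1.702, a commonly used matching
constant that is an assumption of this illustration and is not taken
from the text or from Winsor's paper.

```python
import math

def normal_cdf(z):
    """Integrated normal curve for unit standard deviation."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z, scale=1.702):
    """Logistic curve with its argument rescaled to track the normal curve."""
    return 1.0 / (1.0 + math.exp(-scale * z))

grid = [i / 100 for i in range(-500, 501)]   # z from -5 to 5
max_gap = max(abs(normal_cdf(z) - logistic_cdf(z)) for z in grid)
print(f"largest difference in response: {max_gap:.4f}")
```

A difference of under one percentage point in response would be hard to
detect with the group sizes usual in quantal assays.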

Now consider certain special conditions (denoted by roman numerals) 
that may or may not be assumed to apply to given reaction systems. 


¹See footnote on page 208.

I. Two or more different preparations of essentially the same reagent
except for possible differences in concentration are to be compared in
tests where x, the amount used in a given test, is variable; but all other
conditions are planned to be essentially the same (for example, an amount
of diluent is used to make the total volume of material introduced
essentially the same).

Under condition I, if $K_i$ is the median-effective dose of the i-th prepa-
ration and $x_i$ is the dose of this preparation employed (i = 1, ..., N),
and we take $x_i/K_i = x_j/K_j$ (j = 1, ..., N), then the expected effects
($y_i$, $y_j$) are the same. Accordingly, by relations (2) and (4), we have

(9)    $X(x_i) = X(K_i) + b_i \cdot Y(y_i)$  and
       $X(x_j) = X(K_j) + b_j \cdot Y(y_j)$,

where $Y(y_i) = Y(y_j)$. Obviously, if X(x) = log x, then $b_i \equiv b_j$; and it
is readily demonstrated that if $b_i = b_j$ identically (for every possible i
and j under the circumstances and every possible x/K ratio) then X(x)
must be a linear function of log x; and log x obviously will serve as well
instead for our purposes.

Now, this situation, where the use of X(x) = log x in relation (2)
leads to a system of parallel straight lines (with suitable choice of the
Y-function), appears to be at least approximately realized in many
types of bioassays, even under conditions different from those of I.
However, this has led occasionally to incautious statements which seem 
to proscribe any possibility of another satisfactory basis for bioassay. 
Accordingly, consider some alternative conditions that may throw light 
on this issue. 

II. Assume we are dealing with a single preparation of the active
agent C, the amount $x_i$ being used in the i-th test (i = 0, 1, ..., N), but
also introduced is an amount $V_i$ of an inhibiting agent F (except that
$V_0 = 0$). It is assumed that this amount of inhibitor reacts quantita-
tively with a proportional amount of C to take it out of action (essen-
tially as if firmly bound in chemical combination), but that $x_i$ is so taken
as to leave some excess of active agent C, the excess being $x_i - g \cdot V_i$
and g is a constant. Further assume that the combined material (com-
pounded of C and F) does not influence the reaction appreciably. Obvi-
ously, then $K_i = K_0 + g \cdot V_i$; and if $x_i = x_0 + g \cdot V_i$ then $y_i = y_0$
in any case. Accordingly, it is readily demonstrated that it is necessary and
sufficient in (2) under condition II that X(x) be a linear function of x if a
system of parallel lines is attained. The argument proceeds as under (9)
in even simpler manner but with x instead of log x as the principal
form of X(x). The latter is clearly impossible under condition II;
X(x) = log x will not serve in (2) to give a family of parallel straight lines.

This may seem a rather extreme, perhaps an unnatural case. How¬ 
ever, if the agent C resembled hydrogen-ion from hydrochloric acid, 
inhibitor F resembled hydroxyl ion from sodium hydroxide, and the 
compound formed resembled water we might find the conditions of II 
approximately realized. However, we are led at once to consider the
corresponding situation where C and F do not react completely but some
of each remains dissociated. This is the alternative condition following.

III. Assume that we are dealing as in II with a given preparation C
introduced in the amount $x_i$ with an amount $V_i$ of inhibitor F also
introduced, $V_0 = 0$ and i = 0, 1, ..., N. Assume that C and F may
react in accord with the stoichiometric equation

(10)    F + C = FC

which we assume to be reversible. Assume that this reaction tends
closely to approximate equilibrium conditions where $[C]_i$, $[F]_i$, and
$[FC]_i$ may be used to denote the amounts respectively of C, F, and FC
present at equilibrium, and that these amounts are expressed in equiva-
lent units in accord with relation (10) and are related in accord with

(11)    $\frac{[C]_i\,[F]_i}{[FC]_i} = k$

where k is a constant times the total volume in which the agents are
able to act, which, in turn, is assumed constant and hence k is constant
here. This is the same relation found by Northrop and Hussey to apply
to trypsin and antitrypsin of the blood [28,29]. Assume that x and v are
measured in the equivalent units (where $v = g \cdot V$ and g is a constant as
under condition II), then

(12)    $x_i = [C]_i + [FC]_i$, and

(13)    $v_i = g \cdot V_i = [F]_i + [FC]_i$.

Of course, as $v_0 = 0$, $x_0 = [C]_0$. Further assume that in any case under
III the resultant $y_i$ depends only upon $[C]_i$, the amount of active agent
C.

Now, examine the characteristics of ni by focusing attention on 
the special case, 
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(14) ' [C] i = x 0 for i = 0,1, , N. 

Accordingly, yi = y 0 for every i, whence (11) gives 

(15) [F],- = [FC]<.k/x 0 , and 

Vi = [FC]<.(1 + k/x 0 ) 

(16) 

= (Xi — x 0 )(l + k/x 0 ). 

Accordingly, v ( — v, = (a;,- — *,)(1 + k/x 0 ) provided that y t — y s = y 0 . 
Thus any endpoint y 0 may be used to furnish a measure of differences in 
amount v of- total inhibitor present, since this is proportional to the 
corresponding differences in the amount x of reagent required to produce 
the specified response y 0 , but the factor of proportionality (1 + k/xf) 
will depend upon the endpoint chosen. For many purposes it seems desir¬ 
able to choose y„ = 0.5 as the endpoint (this usually being about the 
point at which the corresponding x may be determined with greatest 
precision); and then, obviously, it is convenient to express x in terms of 
an arbitrary unit (13) taken so that K 0 = 1. Then 

(17) a, = (K { - 1)(1 + k). 

Of course, k is an unknown constant. However, if V is given in terms
of volume of inhibitor preparation introduced, then g = v/V is the
concentration of inhibitor in the preparation; and, in default of knowledge
of k, we may use the increment ratio titer

(18) $T = (K_i - K_j)/(V_i - V_j)$

which equals g divided by the unknown factor (1 + k). This reduces to
$T = (K_i - 1)/V_i$ for $V_j = V_0 = 0$; but it may be preferable to have
both values of V greater than zero to avoid discrepant results that may
be more likely in the region where little or no inhibitor is present.
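
The titer of relation (18) can be illustrated numerically. The following is only a minimal sketch: the values of k, g and the volumes are hypothetical, and the function names are introduced here for illustration. The dose for the chosen endpoint is obtained by rearranging relation (16), with the unit of x taken so that K_0 = 1.

```python
def dose_for_endpoint(x0, k, v):
    """Dose x_i giving the endpoint response in the presence of total
    inhibitor v; relation (16) solved for x_i."""
    return x0 + v / (1.0 + k / x0)


def increment_ratio_titer(K_i, K_j, V_i, V_j):
    """Relation (18): T = (K_i - K_j) / (V_i - V_j)."""
    return (K_i - K_j) / (V_i - V_j)


# Hypothetical values, chosen only for illustration.
K0 = 1.0                      # unit of x chosen so that K_0 = 1
k = 0.5                       # mass-law constant (unknown in practice)
g = 3.0                       # inhibitor concentration, v = g * V
volumes = [0.0, 0.1, 0.2, 0.4]

# Median-effective doses K_i for each volume of inhibitor preparation.
doses = [dose_for_endpoint(K0, k, g * V) for V in volumes]

# Titer from two non-zero volumes, as the text prefers.
T = increment_ratio_titer(doses[3], doses[1], volumes[3], volumes[1])
print(T, g / (1.0 + k))
```

Under these assumptions the two printed numbers agree, which is the statement that T equals g divided by (1 + k).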

Now, consider the possible influences of these conditions on the
form of the curve, f(x), that may be useful in the bioassay by relations
(1) and (2) with suitable definition of X(x) and Y(y) and restriction
of the range of y to that proposed for use in such assays. Relation (16)
gives for the case, $y_i = y_0 = y$ (taken arbitrarily in the given range),

(19) $x_i = x_0 + v_i/(1 + k/x_0)$, whence,

(20) $x_i/x_0 = 1 + v_i/(x_0 + k)$.
IIIA. Accordingly, if k is negligibly small relative to $x_0$, then (19)
gives approximately $x_i = x_0 + v_i = x_0 + g \cdot V_i$, the same as condition
II; and no transformation with X(x) = log x will yield a set of parallel
straight lines in (2) respectively corresponding to various values of $V_i$,
but X(x) = x instead will yield that result approximately.

IIIB. On the other hand, if $v_i$ is negligibly small relative to $x_0 + k$
or if $x_0$ is negligibly small relative to k, then $x_i/x_0$ is approximately
constant for any bioassay (set of tests in which $v_i = v$, a constant, but
$x_i$ is varied to give a consequent response y correspondingly variable).
Accordingly, X(x) = log x would result approximately in a set of parallels
in (2) by suitable choice of Y(y).

IIIC. However, there is a vast possible middle ground wherein $v_i$
is not negligible relative to $x_0 + k$, nor either of the latter negligible
relative to the other. On this ground, neither x nor log x may be used for
X(x) to provide a system of parallels in (2), yet by (16) or the derived
relations (17) or (18) we see that there exists an equally valid basis for
assay in all cases; results based on amount of agent required for a specified
effect ($y_0$) furnish a sounder foundation for assay than any based on
suppositions of a system of parallel straight lines to be found by suitable
transformations which may not exist, or may be elusive. In an Appendix
it is shown that they do not exist.
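
The three regimes can be illustrated numerically. The sketch below uses hypothetical values of k, v and x_0 and helper names introduced only for illustration. For several inhibitor-free doses x_0 (one for each response level) it measures how much the shift caused by the inhibitor varies, once on the scale X(x) = x and once on the scale X(x) = log x; parallel lines on a given scale require that spread to be near zero.

```python
import math


def shifted_dose(x0, k, v):
    """Relation (19): dose giving the same response once inhibitor v is present."""
    return x0 + v / (1.0 + k / x0)


def spread(values):
    """Range of a list; zero means the shift is the same at every response."""
    return max(values) - min(values)


def shift_spreads(k, v, x0_values):
    """Spread of the inhibitor-induced shift on the x scale and on the log x scale."""
    linear = [shifted_dose(x0, k, v) - x0 for x0 in x0_values]
    logarithmic = [math.log(shifted_dose(x0, k, v) / x0) for x0 in x0_values]
    return spread(linear), spread(logarithmic)


# Hypothetical inhibitor-free doses for three different response levels.
x0_values = [0.5, 1.0, 2.0]

iiia = shift_spreads(k=1e-6, v=1.0, x0_values=x0_values)   # k << x0
iiib = shift_spreads(k=1e6, v=1e5, x0_values=x0_values)    # x0 << k
iiic = shift_spreads(k=1.0, v=1.0, x0_values=x0_values)    # middle ground

for name, (lin, log_) in [("IIIA", iiia), ("IIIB", iiib), ("IIIC", iiic)]:
    print(f"{name}: spread on x scale {lin:.3g}, spread on log x scale {log_:.3g}")
```

With these illustrative values the x scale is nearly parallel only in IIIA, the log x scale only in IIIB, and neither in IIIC.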

As in the case of any general statement, so with the denial of existence 
of any other valid basis for bioassay than that valid under condition I, a 
single example cited to contradict the general statement is all that is 
needed to disprove it. More than one contradicting hypothetical exam¬ 
ple has been exhibited. It is not claimed that these represent anything 
more than a first approximation to any real situation; but condition III 
comes fairly close to describing conditions that are found in several of the 
quantitative complement-fixation tests with C representing complement 
and F representing an antigen-antibody complex. In the case of the 
reactions with tuberculous immune serum and antigen [10], if A denotes
the antigen and B the antibody, then the reactions observed closely
approximate what would be expected if A + B = AB and AB + C =
ABC were the stoichiometric relations involved, with AB in place of F
as the antigen-antibody complex which inhibits the action of complement.
Moreover, the system behaves approximately as if not only the
second reaction but also the first followed a mass law resembling that
given in relation (11), as may be noted in figure 3 in the cited article
[10]. Linear relations are found in nearly all if not all these systems
between the amount K of complement required for 50-per-cent hemolysis 
and either the amount of antigen or of antibody in the presence of the
optimum amount of the other. An example is given in Figure 1 taken
from an article by Doctor Rice on the vaccinia virus system [17]. In
Table 2 of the same article are given values of h (there called 1/n) for
relation (8), which is the slope b as given in relation (2), as found
experimentally under a variety of circumstances with different
antigen-antibody systems. In the presence of both serum and antigen, the slope
h may be seen to tend to increase as the amount of complement used to
bring about a conveniently measurable reaction increases only in the
case of the vaccinia virus system; in all others it decreases or remains
about the same. The last situation is exhibited strikingly with the
pneumococcus system. Approximately constant h is found for the higher
preliminary incubation temperature but not for the lower, as indicated
in Doctor Rice's table. Constant h was also found with the egg albumen
system studied by Maltaner and Maltaner [12].

FIGURE 1

The ordinate is K, the median-effective dose of complement, shown as an approximately linear function
of amount of vaccinia virus antigen (V.V.C.) present with the optimal amount of pooled antiserum
(V.V.R.S.) or of normal serum (N.R.S.) of rabbits.


This figure is reproduced, by permission, from the Journal of Immunology , 1946, BS t 225-236 (Williams 
and Wilkins Co.). 
In an appendix it is shown that under condition III there exist no
transformations, X(x) and Y(y) in (2), that yield a system of parallels.
On the other hand, it is readily seen that we might measure variation in 
the amount of one inhibitor even in the presence of other inhibitors under 
certain similar conditions provided the latter inhibitors were present in 
approximately constant amount. This may be actually the case in some 
living systems and suggests a possible explanation of the approximation 
to condition IIIB often observed. 


APPENDIX 


THE POSSIBILITY OF SOLUTIONS UNDER CONDITION III 

Assume tentatively under condition III that there exists a continuous
function of x, X(x), for x > 0; that its first derivative X'(x) exists and is
continuous, positive, and finite for x > 0; and that the X-response curves
for different assays under the given conditions all have the same shape.
The last is equivalent to the assumption that a function Y(y) exists as
previously defined such that (2) holds with b constant for all assays and
a the sole parameter distinguishing the corresponding parallel straight
transform lines. Then b > 0 and the first derivative Y'(y) exists and is
continuous, positive, and finite for 0 < y < 1.

To simplify the notation let $v = v_i$ and let $x_v$ denote the corresponding
$x_i$ under III which produces the response y in the presence of an amount
of total inhibitor equal to v. Then, as before, $x_0$ is the value of $x_v$ for
v = 0, and (19) gives

(A1) $x_v = x_0 + v/(1 + k/x_0)$, and $K_v = K_0 + v/(1 + k/K_0)$

since $x_v = K_v$ for y = 0.5. Accordingly, (2) gives

(A2) $X(x_v) = a_v + b \cdot Y(y)$

since Y(0.5) = 0 and thus $a_v = X(K_v)$ is a parametric constant and b is
a constant. Accordingly,


(A3) $\dfrac{dX(x_v)}{dY(y)} = \dfrac{dX(x_v)}{dx_v} \cdot \dfrac{dx_v}{dY(y)} = X'(x_v) \cdot \dfrac{dx_v}{dY(y)} = b,$

a constant; and therefore

(A4) $X'(x_v) \cdot \dfrac{dx_v}{dx_0} = X'(x_0).$
Now, (A1) gives $dx_v/dx_0 = 1 + kv/(x_0 + k)^2$.

Let $\Delta x = x_v - x_0$; then (A1) gives

$\Delta x = v/(1 + k/x_0)$, whence $v = (1 + k/x_0) \cdot \Delta x$,

whence (A4) gives

(A5) $X'(x_0 + \Delta x) \cdot [1 + k \cdot \Delta x / (x_0 (x_0 + k))] = X'(x_0)$,

(A6) $\dfrac{X'(x_0 + \Delta x) - X'(x_0)}{\Delta x} = -\dfrac{k \cdot X'(x_0 + \Delta x)}{x_0 (x_0 + k)}$.


Accordingly, we obtain the linear differential equation,

(A7) $(x^2 + kx) \cdot X''(x) + k \cdot X'(x) = 0$,

which may be considered as linear of first order in X'(x). By well-known
methods [30,31] this yields the general solution $X'(x) = C_1 (1 + k/x)$
and hence

(A8) $X(x) = C_1 (x + k \cdot \log x) + C_2$

where $C_1$ and $C_2$ are arbitrary constants except that $C_1 > 0$ in the present
case as X'(x) > 0 for x > 0.
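
As a numerical cross-check, one may substitute the general solution (A8) into the differential equation (A7) and confirm that the left member vanishes. The sketch below does this with central finite differences; the step size, the test points, and the function names are assumptions made for illustration, not part of the original derivation.

```python
import math


def X(x, k, c1=1.0, c2=0.0):
    """General solution (A8): X(x) = C1 * (x + k * log x) + C2."""
    return c1 * (x + k * math.log(x)) + c2


def residual_a7(x, k, h=1e-3, c1=1.0, c2=0.0):
    """Left member of (A7), (x^2 + k x) X''(x) + k X'(x),
    with the derivatives approximated by central differences."""
    first = (X(x + h, k, c1, c2) - X(x - h, k, c1, c2)) / (2.0 * h)
    second = (X(x + h, k, c1, c2) - 2.0 * X(x, k, c1, c2) + X(x - h, k, c1, c2)) / (h * h)
    return (x * x + k * x) * second + k * first


# Illustrative test points; any x > 0 and k >= 0 could be used.
residuals = [residual_a7(x, k) for x in (0.5, 1.0, 3.0) for k in (0.5, 2.0)]
print(max(abs(r) for r in residuals))
```

The printed maximum is small and of the size expected from the finite-difference approximation, consistent with (A8) satisfying (A7).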

Accordingly we have in (A8) a necessary condition if X(x) exists under
relations (A1) and (A2). Obviously, we may take $C_2 = 0$ without loss
in essential generality, and likewise take $C_1 = 1$. Thus, if there is any
solution to (A1) and (A2) for X(x), then we may use

(A9) $X(x) = x + k \cdot \log x$

as well as any for our purpose. Now, (A2) gives

(A10) $X(x_v) - X(K_v) - X(x_0) + X(K_0) = 0$.

Define ω equal to the left member of (A10) with $x + k \cdot \log x$
substituted for X(x). Then

(A11) $\omega = x_v - K_v + k \cdot \log\dfrac{x_v}{K_v} - x_0 + K_0 - k \cdot \log\dfrac{x_0}{K_0}$,

whence (A1) gives

(A12) $\omega = \dfrac{k \cdot v (x_0 - K_0)}{(x_0 + k)(K_0 + k)} - k \cdot \log\dfrac{1 + v/(K_0 + k)}{1 + v/(x_0 + k)}$.

Obviously, for ω to vanish identically, we have a solution if k = 0
(approximated in condition IIIA) and then we take X(x) = x.
However, in any case for ω ≡ 0, differentiating both sides of
(A12) with respect to v gives

(A13) $0 = \dfrac{k (x_0 - K_0)}{(x_0 + k)(K_0 + k)} - \dfrac{k (x_0 - K_0)}{(x_0 + k + v)(K_0 + k + v)}$,

whence: either k = 0, or $x_0 = K_0$, or v = 0.

Thus, if k ≠ 0 there is no solution to (A1) and (A2). It is readily verified
that


(A14) $|\omega| < \dfrac{k \cdot v^2}{2} \left| \dfrac{1}{(K_0 + k)^2} - \dfrac{1}{(x_0 + k)^2} \right|$,

which indicates the conditions for which an approximate solution to 
(Al) and (A2) exists. 
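
The behaviour of ω can be checked numerically. The sketch below, with hypothetical values of x_0, K_0, k and v and function names introduced only for illustration, evaluates ω once from its definition in (A11), using (A1) for x_v and K_v, and once from the closed form (A12). It then confirms that ω vanishes when k = 0 but not for a representative k > 0.

```python
import math


def x_v(x0, k, v):
    """Relation (A1): dose giving the same response when inhibitor v is present."""
    return x0 + v / (1.0 + k / x0)


def omega_direct(x0, K0, k, v):
    """(A11): left member of (A10) with X(x) = x + k * log x."""
    xv, Kv = x_v(x0, k, v), x_v(K0, k, v)
    return xv - Kv + k * math.log(xv / Kv) - x0 + K0 - k * math.log(x0 / K0)


def omega_closed(x0, K0, k, v):
    """(A12): the same quantity after substituting (A1)."""
    first = k * v * (x0 - K0) / ((x0 + k) * (K0 + k))
    ratio = (1.0 + v / (K0 + k)) / (1.0 + v / (x0 + k))
    return first - k * math.log(ratio)


# Hypothetical values: x0 = 2 and K0 = 1 in the same arbitrary dose units.
with_k = omega_direct(2.0, 1.0, 1.0, 1.0)
without_k = omega_direct(2.0, 1.0, 0.0, 1.0)
print(with_k, omega_closed(2.0, 1.0, 1.0, 1.0), without_k)
```

Under these assumptions the two expressions for ω agree, and ω is zero only in the case k = 0, in line with the conclusion drawn from (A13).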

The lack of such solutions under condition III means only that 
transformations of assay curves to straight parallel lines by means of 
some functions X(x) and Y(y) do not exist. Valid conditions for assay 
relative to some fixed reference point (y 0 ) have been shown to exist 
always under condition III. 


In a general discussion of bioassay in another article [1] a different notation was employed as
there more convenient, but a key was given (page 116, footnote 2) to other notations including that
used in a previous article [13] on this subject which is similar to the present notation. However, it is
planned to present a notation system elsewhere [19] to facilitate more comprehensive discussion of
fundamental notions in complement-fixation reactions.
QUERIES

QUERY 59: I am investigating the production of hard seeds in crimson
clover. Since individual plants are extremely variable, I have
devised a technique for splitting young plants, then growing the
halves to maturity. The two half-plants vary much less than whole
plants in their production of hard seeds.

The table shows the percent germination (that is, soft seeds) in
half-plants grown in pots in the greenhouse. The arrangement of the
pots was random. Each comparison consists of four pairs of half-plants.
The third pair in Comparison II shows the typical variability of plants;
in this respect, this experiment is the most uniform I have run.


PERCENT GERMINATION OF SEEDS FROM 16 PAIRS OF HALF PLANTS IN FOUR COMPARISONS

          Comparison I       Comparison II      Comparison III     Comparison IV
          (1)      (2)       (3)      (4)       (5)      (6)       (7)      (8)
          c1p1k1   c2p2k2    c1p1k2   c2p2k1    c1p2k1   c2p1k2    c1p2k2   c2p1k1

           73                 97       92        92       77        70       92
           76                 93       80        83       75        81       71
           94                 59       30        97       84        98       78
           76                100       95        91       83       100       98

Total     319      292       349      297       363      319       355      339


My main interest is in the main effects. If the high level of calcium
leads to hard seed production, then all c2 comparisons would be expected
to have lower percent germinations. With the exception of two
pairs, this was the case. The higher levels of phosphorous and potassium
failed to produce such an effect.

I am not obviating the possibility of interactions. I will be very much
interested in your method for obtaining these and in a test of significance
of the effects.

ANSWER: These percentages are perhaps based on small and varying
numbers of seeds. For exact work this should be investigated
and suitable transformation made. Such refinements
might affect the results given below but would not change the
methods described.

Preliminary calculations indicate that the only effects that approach
significance are the calcium main effect and the calcium-phosphorous
interaction. The former is made up of the four differences:

(8) - (1): c2p1k1 - c1p1k1 = 339 - 319 =   20
(6) - (3): c2p1k2 - c1p1k2 = 319 - 349 =  -30
(4) - (5): c2p2k1 - c1p2k1 = 297 - 363 =  -66
(2) - (7): c2p2k2 - c1p2k2 = 292 - 355 =  -63

                                  Total   -139

That is, calcium depressed the percent of germinating seeds, presumably 
stimulating the production of hard seeds. This stimulating effect of 
calcium was enhanced by the presence of phosphorous, the difference for 
computing the interaction being 

(-66 - 63) - (-30 + 20) = -119 

The two mean squares to be compared with the error variances are
(-139)²/32 = 604 and (-119)²/32 = 443.

The two error variances involved, one between half-plants from the 
same plant and the other between the sums of these two half-plants, 
result in treatment comparisons at different levels of precision. 

The error variances for the three main effects and for the second
order interaction are the same, each being equal to the variance of
half-plants from the same plant. To see this, observe that the variance
for the calcium effect is, using the numerical designations of the
treatments, the variance of

(8) + (6) + (4) + (2) - (1) - (3) - (5) - (7),

which may be written,

[(8) - (7)] + [(6) - (5)] + [(4) - (3)] + [(2) - (1)].

Each pair is made up of four differences between half-plants from the
same plant. The variance of these differences is 133.24, double the mean
square per half-plant shown in the analysis of variance. Since there are
four differences per treatment pair and four such pairs, the variance of
the total is 16(133.24) = 2,131.84; hence, the variance per half-plant is
2,131.84/32 = 66.6. There are three degrees of freedom for each of the
four treatment pairs. For testing the significance of the calcium effect,
we now have

F = 604/66.6 = 9.07, F.01 = 9.33.





The calcium-phosphorous interaction is given by

[(1) + (2)] + [(3) + (4)] - [(5) + (6)] - [(7) + (8)].

Here, as in the other two first order interactions, the treatment pairs are
made up of four sums of half-plants, the variance of a sum being 782.6.
Calculating as before, the variance per half-plant is now 391. F is
443/391 = 1.13, obviously non-significant.
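
The arithmetic behind the two F ratios can be retraced in a few lines. This is only a sketch using the figures quoted in the answer; the variable and function names are introduced here for illustration, and small differences in the last digit against the printed values come from rounding.

```python
# Figures quoted in the answer.
VAR_OF_DIFFERENCE = 133.24   # variance of a difference between two half-plants of one plant
VAR_OF_SUM = 782.6           # variance of a sum of two half-plants of one plant
N_TERMS = 16                 # four differences (or sums) in each of four treatment pairs
N_HALF_PLANTS = 32


def variance_per_half_plant(var_of_pair):
    """Variance of the treatment total, expressed per half-plant."""
    return N_TERMS * var_of_pair / N_HALF_PLANTS


ms_calcium = (-139) ** 2 / N_HALF_PLANTS        # about 604
ms_interaction = (-119) ** 2 / N_HALF_PLANTS    # about 443

f_calcium = ms_calcium / variance_per_half_plant(VAR_OF_DIFFERENCE)
f_interaction = ms_interaction / variance_per_half_plant(VAR_OF_SUM)

print(f"calcium: F = {f_calcium:.2f}")
print(f"calcium x phosphorous: F = {f_interaction:.2f}")
```

The calcium effect is tested against the more precise within-plant error, while the interaction is tested against the much larger between-plant error, which is why the two ratios differ so sharply.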

The analysis of variance of the entire experiment is shown in the 
table. 


ANALYSIS OF VARIANCE OF PERCENT GERMINATION

Source of Variation                                      Degrees of    Mean Square
                                                         Freedom
Treatment:
    C                                                        1           603.78
    P                                                        1            11.28
    K                                                        1             0.28
    CP                                                       1           442.53
    CK                                                       1            69.03
    PK                                                       1            16.53
    CPK                                                      1            87.78
Differences between half-plants from the same plant         12            66.62
Sums of half-plants from the same plant                     12           391.32
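
The seven treatment mean squares in the table can be recomputed from the eight treatment totals. The sketch below is one way to do so; the dictionary layout and the function names are assumptions made for illustration. Each effect is a signed sum of the totals, with the sign given by the product of the factor levels involved (level 2 counted as +1, level 1 as -1), and its mean square is the squared contrast divided by the 32 half-plants.

```python
# Treatment totals over four half-plants, keyed by the levels (c, p, k).
totals = {
    (1, 1, 1): 319,  # treatment (1)
    (2, 2, 2): 292,  # treatment (2)
    (1, 1, 2): 349,  # treatment (3)
    (2, 2, 1): 297,  # treatment (4)
    (1, 2, 1): 363,  # treatment (5)
    (2, 1, 2): 319,  # treatment (6)
    (1, 2, 2): 355,  # treatment (7)
    (2, 1, 1): 339,  # treatment (8)
}
FACTORS = "CPK"
N_HALF_PLANTS = 32


def contrast(effect):
    """Signed sum of treatment totals for an effect such as 'C' or 'CP'."""
    total = 0
    for levels, value in totals.items():
        sign = 1
        for name in effect:
            sign *= 1 if levels[FACTORS.index(name)] == 2 else -1
        total += sign * value
    return total


def mean_square(effect):
    """Single-degree-of-freedom mean square for the effect."""
    return contrast(effect) ** 2 / N_HALF_PLANTS


for effect in ["C", "P", "K", "CP", "CK", "PK", "CPK"]:
    print(f"{effect:>3}  contrast {contrast(effect):5d}  mean square {mean_square(effect):7.2f}")
```

The contrasts for C and CP come out as -139 and -119, the totals used in the answer.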


Had querist wished to evaluate all the effects with equal precision,
he could have used balanced incomplete blocks with a pair of half-plants
in each block. He has actually repeated the experiment with this
design, and has been asked to present his results in “Queries”.

This poses a nice problem for every experimenter: to foresee his
needs with sufficient certainty to choose the exact design which will
satisfy them.

George W. Snedecor


QUERY 60: I enclose data I have collected on chick growth and have
indicated what appear to be four possible ways of showing the
significance of the differences between the means obtained. My
problem is one of how best to present these data and their implications
in print. Your comments and suggestions as to the correct usage of
these four methods will be greatly appreciated.

In a feeding trial with chicks designed to study the comparative
feeding values of six protein supplements, fourteen chicks were used per
lot to begin with. The birds were all of the same sex and age and were
from the same flock. During the course of the experiment, some chicks
died due to unknown cause.
On the basis of chemical analyses, one half of the horsebean seed meal
protein was replaced in the basal ration by the indicated protein
supplements. All rations had the same level of crude protein.

The birds receiving each ration were housed together. Record was
kept of the amounts of feed eaten by the lots only. The results are given
in the table.


FINAL WEIGHTS OF CHICKS AT SIX WEEKS (GRAMS)

                          Source of Protein Supplement

                   Linseed     Soybean     Sunflower Seed
       Horsebean   Oil Meal    Oil Meal    Oil Meal         Meat Meal   Casein
  1      179         309         243         423              325        368
  2      160         229         230         340              257        390
  3      136         181         248         392              303        379
  4      227         141         327         339              315        260
  5      217         260         329         341              380        404
  6                  203         250         226              153        318
  7                  148         193         320              263        352
  8                              271         295              242        359
  9                              316         334              206        216
 10                              267         322              344        222
 11                              199         297              258        283
 12                              177         318                         332
 13                              158
 14                              248

Mean     160.2       219.5       246.9       324.4            271.6      306.9


ANALYSIS OF VARIANCE

Source of Variation       Degrees of Freedom    Mean Square
Protein Supplements               5               43.627
Error                            65                4.645


Four methods of comparison between lots are proposed:

1. Fiducial limits for each lot mean.

2. Difference required for significance at 0.05 level for each individual
pair of lots.

3. The largest value of the difference required for significance is
taken as the common value for testing the significance for all pairs of lots.

4. For each individual pair of lots the value of t is used to test
significance.





















ANSWER: Your proposed method 1 is not a test of significance, but it
is sometimes useful for assessing the results of an experiment
like this if no set of independent comparisons is
provided in the design.

Methods 2 and 4 produce results that are essentially the same. The 
15 comparisons contemplated are not independent. The probabilities 
associated with them are not likely 0.05, and there is no way to de¬ 
termine what they are. For further comments on this method, see this 
Journal, Vol. 1, p. 26. 

Method 3 is an inexact variant of method 2. The advantage is slight 
because all of the significant differences must be calculated in order to 
isolate the largest. 

Subject to the severe limitations discussed below, I would suggest 
some such orthogonal set of comparisons as the following: 

(a) Standard source (horsebean) vs. other sources. 

(b) Vegetable sources vs. animal. 

(c) Meat vs. casein. 

(d) Two comparisons among vegetable sources. 
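
One possible set of coefficients for comparisons (a) to (d) is sketched below. The particular split in (d) and the coefficient values are assumptions made for illustration, since the answer does not specify them. The check of orthogonality as written applies to lots of equal size; with the unequal lot sizes of this experiment the products would need to be weighted by the reciprocals of the lot sizes.

```python
from itertools import combinations

feeds = ["horsebean", "linseed", "soybean", "sunflower", "meat meal", "casein"]

# Illustrative coefficients, one per feed in the order listed above.
contrasts = {
    "(a) horsebean vs. other sources": [5, -1, -1, -1, -1, -1],
    "(b) vegetable vs. animal": [0, 2, 2, 2, -3, -3],
    "(c) meat vs. casein": [0, 0, 0, 0, 1, -1],
    "(d1) linseed and soybean vs. sunflower": [0, 1, 1, -2, 0, 0],
    "(d2) linseed vs. soybean": [0, 1, -1, 0, 0, 0],
}


def dot(u, w):
    """Sum of products of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, w))


def is_orthogonal_set(vectors):
    """True if every vector sums to zero and every pair has zero product sum."""
    sums_to_zero = all(sum(v) == 0 for v in vectors)
    pairwise_zero = all(dot(u, w) == 0 for u, w in combinations(vectors, 2))
    return sums_to_zero and pairwise_zero


print(is_orthogonal_set(list(contrasts.values())))
```

Five mutually orthogonal comparisons use up the five degrees of freedom among six lots, so each can be examined on its own.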

Although your method of testing is inexact, this is a difficulty which 
I consider minor because decisions are usually based on many bits of 
evidence of which the test of significance is only one. Of far greater
import are difficulties inherent in the design of the experiment. There 
is no suitable replication of the treatments. As a consequence, there are 
only ambiguous answers to questions supposed to have been cleared up. 

Your estimate of error is based on the variation in weight of chicks 
housed together in the several lots. Two doubts assail such an estimate: 
(i) Are these weights independent or are there correlations due to 
environment? (ii) To what extent is the variation among these weights 
affected by differences in food intake? In the experimental design, no
provision is made for answering either question. 

In the treatment differences, possible effects of environment and food 
intake are confused with any real effects of the sources of protein. There
is no way to untangle them. Some method of regression might be used
to compensate for the differences in food consumed, but there is no
comparable estimate of error.

You are to be commended for segregating the sexes for this experi¬ 
ment. I have encountered data in which confusion was worse confounded 
by unknown inequalities in the sex ratios. 

It is clearly my opinion that questions about tests of significance are 
of little weight as compared to those about the design and conduct of the 
experiment. Unless unambiguous information is incorporated in the
data, statistical methods for extraction are futile.

George W. Snedecor



THE BIOMETRIC SOCIETY 


The Western North American Region of the Biometric Society was 
organized in Berkeley, California on June 24. Regional officers, nomi¬ 
nated at the meeting and since elected by the Council, are 

Vice-President, F. W. Weymouth, head of the department of physiol¬ 
ogy at Stanford University; Secretary-Treasurer, Mrs. Bernice Brown, 
statistical analyst at Project Rand of the Douglas Aircraft Co.; regional 
committee members, James E. Holloway, entomologist of the University 
of California, Jerome C. R. Li, assistant professor of mathematics at 
Oregon State College, G. A. Baker, assistant professor of mathematics 
and assistant statistician of the experiment station of the University of 
California, and Wade Rollins, research assistant at the University of 
California, College of Agriculture at Davis. 

The Australian Region is well along in formation with E. A. Cornish 
and Helen N. Turner carrying the ball. A Regional meeting is planned 
at Hobart in January, in conjunction with the meeting of the Australian 
and New Zealand Association for the Advancement of Science. Permanent
officers will be named at that time, and decisions made as to
future policy and regional meetings. 

Several New Zealanders, interested in the Society, have proposed or¬ 
ganizing a New Zealand Region because of their distance from Australia. 
Dr. J. T. Campbell, senior lecturer in mathematics at Victoria University in 
Wellington, has arranged a gathering of New Zealand statistical workers at 
the University in late August and plans to bring up the question at that time. 

The Eastern North American Region is holding a joint session with the
Vital Statistics Section of the American Public Health Association on
Wednesday, November 10 during the annual meeting of the Association in Boston.

A round table discussion of Morbidity Surveys is scheduled for 
9:30 A.M. in the Paul Revere Banquet Hall, Mechanics Building. 
General discussion will follow the panel participants. 

Moderator: Mr. George St. J. Perrott, Chief, Division of Public Health
Methods, U. S. Public Health Service
Public Health Statistician: Mr. Theodore D. Woolsey, Division of Public
Health Methods, U. S. Public Health Service
Sampling Statistician: Mr. Nathan Keyfitz, Mathematical Advisor,
Dominion Bureau of Statistics, Ottawa, Canada
Opinion Poll Statistician: Dr. George Gallup, Director, American Institute
of Public Opinion, Princeton, New Jersey
Epidemiologist: Dr. Alexander D. Langmuir, Associate Professor,
Department of Epidemiology, School of Hygiene and Public Health,
Johns Hopkins University
Health Officer: Dr. Huntington Williams, Commissioner of Health,
City of Baltimore
BIOMETRY

R. A. Fisher* 

The rise of biometry in this 20th century, like that of geometry in 
the 3rd century before Christ, seems to mark out one of the great ages or 
critical periods in the advance of the human understanding. From its 
humble beginnings in meeting the needs and satisfying the practical 
requirements of the work of land measurement and architecture, geometry 
found its way, by the enchanting clarity of its concepts and processes, 
into the heart of what the Greek world meant by a liberal education; 
an education that is fit for free men who would think for themselves, and 
not fit only for slaves and officials whose aims and concepts were dictated 
from above. It was the liberation of the spirit experienced by the Greek 
students of geometry which gave the subject to them the exalted status 
it undoubtedly held, and won the veneration of the entire period. We 
can, I think, partly understand their feeling, when we realise that here 
for the first time the human spirit came to handle abstractions, of their 
nature necessarily timeless and perfect, and to handle them with con¬ 
fidence, because they were well defined. The well defined abstraction 
seems, in fact, to be the invention of the Greek geometers, and an inven¬ 
tion of lasting significance to human thought. 

But it was not merely its conceptual clarity which gave to geometry
its fascination. With well defined concepts the intellect found itself
capable of acting with unprecedented efficiency. Men learnt to reason,
deductively, from well defined abstract concepts, to cogent and
irrefragable conclusions. And with its use, with its exercise, in the field of
geometry, the principles of deductive reasoning came to be understood,
or at least, to be codified, so as to give rise to the subject known as Logic.
It has been a fashion among some modern mathematicians to speak of
Mathematics itself, or themselves, as but a branch of logic. This, of
course, is but a formalisation, appropriate to a purely deductive habit
of mind. The historical fact unquestionably shows logic as a later
growth, a formulation of the thought processes, in which the practice of
geometry had already made man sufficiently adept, to ensure agreement
as to general principles. And the conclusions of geometers themselves,
apart from the artistically unified presentation which Euclid gave them,
embody the horse-sense of ages of predecessors trying to measure
accurately, and using increasingly subtle and indirect means of
measurement and of accurate construction. Even in Euclid's treatise it is
discernible that the grand aim towards which the whole edifice of theorems
and problems is directed is the practical and exact construction of the
five perfect or absolutely regular figures which are possible in three
dimensions.

*R. A. Fisher, president of the Society, addressed the Inaugural meeting of the British Region in
London on April 2d. His address was such an adroit summary of the reasons for launching a new
scientific Society that we are publishing it in this issue of Biometrics.

Now, I suppose circumstances might have conspired to give to
surveying, or to astronomy, or to any other subject sufficiently rich in
observational detail, the honour of compassing the second great stage
of intellectual liberation, by making known the principles of that second
and scarcely explored mode of logic, which we know as induction; of
clarifying the principles of reasoning from the particular to the general,
from the observations to the hypotheses, in ways necessarily inaccessible
to purely deductive logic, or to any mathematics which can properly be
regarded as derivable wholly from deductive logic, of making men free to
recognize with certainty the consequence not of axioms or dogmas, but
of carefully ascertained facts. But, as it has happened, it has been
reserved for Biometry, the active pursuit of biological knowledge by
quantitative methods, to take this great step; and the man who in the
nineteenth century did more than any other to prepare the way was,
I think, undoubtedly Francis Galton.

The peculiarity of Galton's temperamental make-up which led him
to play this part was, in my opinion, the insistent need that he felt to
think constructively about variable phenomena. Unquestionably he
was led to concentrate his attention upon variation, through the central
place which variation held in the theory of evolution, which his
half-cousin Charles Darwin had put forward, and which influenced Galton
profoundly, as appears clearly in his book Hereditary Genius, published
after the Origin by only ten years. To Galton, however, variation of
[...] an appeal, or a fascination, as much in meteorology for
example as in heredity, and this appeal we can appreciate if we consider
what an obstacle to coherent thought mere quantitative variation had
formerly been. Even now, common phrases and modes of thought
express this impotence from which Galton's generation was just emerging,
thanks largely to Galton's own efforts. If one were to say: “Nothing
definite can be asserted about the political opinions of entomologists,
for their opinions vary”, even an audience of biometers might admit the
statement as rational, although they all know perfectly well, from their
own constant experience, that a great variety of definite statements
[...] be made about every variable phenomenon that had been studied.
Without this experience, however, and the bulk of mankind are without
[...] the modern concept of frequency distributions,
[...] of thinking coherently in terms of frequency distribution
[...] to a full stop. The urge, apparent in elementary
books of statistics, to find “measures of central tendency” as conceptually
constant substitutes for the really variable values, embalms one of
the earliest efforts to evade the intellectual difficulty. With better 
apparatus of thought at our disposal, we can now reinterpret the meas¬ 
ures as statistical estimates. As such they make sense; but we should 
remember that they are still introduced and taught at a stage when there 
is as yet no thought of estimation. Then again, we still have the ad¬ 
ministrative compromise, such as “A fair wage is one sufficient to main¬ 
tain in decency a wife and three children”, and it is with pained surprise, 
and with great reluctance, that the administrator learns to admit that 
such a decision will leave half the real children of the country, belonging 
to families of four or more, insufficiently provided for, and that at the 
same time it saddles the wage fund, and therefore the purchaser of 
goods, with providing for about twenty million non-existent children. 
In this aspect family allowances constitute an elementary recognition of 
the unwelcome fact of biological variability. 

The primitive function of the biometric movement, characteristic
of the present century, is therefore to conserve by constant use, and
incidentally to improve and refine, the thought forms, which make
possible an understanding of variable phenomena. These phenomena
come to our knowledge by observation of the real world, and it is no
small part of our task to understand, design and execute the forms of
observation, surveys or experiments, which shall be competent to supply
the knowledge needed. The observational material requires interpretation
and analysis, and no progress is to be expected without constant
experience in analysing and interpreting observational data of the most
diverse types. Only so, as I have suggested, can a genuine and
comprehensible formulation of the processes of inductive reasoning come
into existence. As we bear these objects in mind, as we allow ourselves
to appreciate their immense practical importance; as we yield to their
intellectual fascination, so, it is common experience, we come to think
of ourselves less in terms of the special scientific disciplines, less as
chemists or entomologists, geneticists or mathematicians, and more in
terms of the community of our interests with those doing similar work
in other departments. It is to promote interchange of ideas, personal
contacts, and mutual appreciation of our diverse problems and methods,
that we have felt the need of a new scientific organisation, in which our
work may be viewed in a new perspective, not as something extraneous
and eccentric, a funny sort of botany, for example, or of palaeontology,
or of medicine, but as a tidal movement of our time, which has already
[...] to refresh and reinforce the means of research in all the biological
sciences.



NEWS AND NOTES 

CANADA —James W. Fisher, Virologist, Laboratory of Hygiene, 
Department of National Health and Welfare, Ottawa, states that they 
are happy in their new quarters. In response to an inquiry regarding the
use of statistical tools at the Laboratory, he replies: “The Laboratory 
of Hygiene, being the control laboratory for certain biologicals designed 
for human use in Canada, finds statistical tools to be of inestimable 
value especially for assay work. The graded method has found wide 
application in the Antibiotic Section; while the quantal has justified its 
worth in assaying tetanus and diphtheria toxoids. The aid of the 
statistician has been welcomed in the Bacteriology Section where prob¬ 
lems are encountered that deal with the quantitative estimation of the 
number of microbes in foods, or surviving in the presence of various 
disinfectants. In estimating certain properties of viruses by their 
lethal or other effects produced in animals or eggs, under various
experimental conditions, statistical tools have been found to be potent
weapons—the “methodology” of the analysis of variance in particular.
Those in the Virus Section are becoming familiar with probits, standard 
errors, tests of significance, etc., when referring to either in vitro or in 
vivo laboratory procedures. We are using and attempting to develop 
better methods so that our experimental procedures will be more
efficient and our results mathematically sound.” ... Allan Paull completed
his Ph.D. in statistics and has returned to the Grain Research Labora¬ 
tory, Board of Grain Commissioners, Winnipeg, Manitoba. His dis¬ 
sertation was “On A Preliminary Test for Pooling Mean Squares in the 
Analysis of Variance,” under the supervision of W. G. Cochran. 

CHINA —C. M. Wang has, at our request, sent us information regard¬ 
ing The Biometry Laboratory in the College of Agriculture, Taiwan 
National University. The Laboratory was established in the fall of 
1946 “(1) for teaching college students and training junior staffs of other 
laboratories and research stations in Taiwan (Formosa) in order to 
rationalise the experiments, to save time and expense in the interval 
from test to practice and thus to prompt and to facilitate the
agricultural extension, (2) to publish the results of experiments and researches.”
They have paid attention especially to the statistical methods of small 
samples and the consistency of practice with the underlying theory. The 
staff consists of Director Wang, two lecturers, two assistant teachers 
and seven other assistants. Their curriculum includes courses in biom¬ 
etry, mathematics of statistics, field plot techniques and special lectures 
on problems of biometrical technique. “Study on correlations among 
various characteristics of rice plants since the spring of 1947 and field 
technique research in rice experiments since the spring of current year
are being conducted. Experiments with rice, jute and sugar cane are in 
progress cooperating with the Taiwan Province College of Agriculture; 
the Taiwan Agricultural Research Station and the Sugar Cane Research 
Institute of Taiwan Sugar Corporation.” 

ENGLAND — John Wishart is to be a Visiting Professor at the Uni¬ 
versity of North Carolina, Institute of Statistics, Raleigh, during the 
spring quarter of 1949. He will give a course in Experimental Designs. 
Dr. Wishart has been interested in biometric statistics for many years, 
having assisted Karl Pearson and R. A. Fisher in their laboratories 
before going to the School of Agriculture, Cambridge. 

INDIA —The Statistics Section of the Indian Science Congress met 
this year at Patna. A symposium on “crop-forecasts from a study of 
weather conditions” was held. L. A. Ramdas, Agricultural Meteorolo¬ 
gist of Poona Meteorological Observatory, spoke of his experiences about 
the extent of variation in crop-yields due to weather. The other 
speakers discussed the effects of specific meteorological factors on various 
crops ... U. S. Nair, Professor of Statistics, Travancore University, was 
elected the President; Sadasiv Sengupta, Statistical Officer of the East 
India Railway, Recorder of this Statistics Section... In the March 1948 
bulletin of the Calcutta Statistical Association, K. K. Mathen, All-India 
Institute of Hygiene and Public Health, Calcutta has an expository 
article on “Studies on the sampling procedure for a general health sur¬ 
vey” ... A statistical conference was held January 27 to 31, 1948, at 
Singapore. It was presided over by W. Clyde. The countries participat¬ 
ing were: Australia, Burma, French Indo-China, Malaya and British 
North Borneo, Hongkong, India, The Malaya Union, The Netherlands 
East Indies, Siam and Singapore. India was represented by P. V. 
Sukhatme and G. M. Sankpal. The task before the conference was 
essentially of an exploratory character aimed at standardizing rice 
crop statistics and improving the methods of collecting them. 

ITALY — F. Brambilla, of the University “Bocconi” in Milan, held last 
winter a course on “Statistics for Geneticists” at the Zoological Station 
in Naples. The course was attended by members of the staff of the 
Zoological Station, guests and students of the University of Naples . .. 
L. L. Cavalli, of Milan, recipient of a fellowship from the Italian National 
Research Council, is working at the John Innes Horticultural Institution,
London, with K. Mather on problems of biometry and statistical 
genetics. 

PALESTINE — Jos. Carmin, Independent Biological Laboratories,
Kefar-Malal, P. O. Ramatayim writes: “As to statistical aspects of our
research, I am trying my best to have it up to date. In planning any 
research work the statistical aspect is always taken into consideration 
as well as in working out the data.” 


UNITED STATES —The Statistical Section of the Tennessee Public 
Health Association met May 4, at Nashville, Tennessee. Ann Dillon, 
chairman, presided. At the morning session papers were given by 
Eleanor Gorham, Nashville Council of Community Agencies; Parker 
Mauldin, Division of Medical Research Statistics, Veterans Administra¬ 
tion; Jack Moshman, Office of Medical Advisor, Atomic Energy Com¬ 
mission, Oak Ridge; and Helen G. Maher, Tennessee Valley Authority. 
After a luncheon meeting, the speakers at the afternoon session were 
Carolina Randolph, The Commonwealth Fund; William F. Elkin,
Oak Ridge Health Department; Sara Lou Hatcher, Tennessee Depart¬ 
ment of Public Health; and Mildred Patterson, Sumner County Health 
Department. At a business meeting, the following officers were elected 
for 1948-1949: Chairman, Margaret Martin, Vanderbilt University, 
Department of Preventive Medicine and Public Health; Vice-Chairman,
William F. Elkin; and Secretary, Sara Lou Hatcher ... Alexander M. 
Mood and Arthur J. Brown, members of the staff of the Statistical 


Laboratory, Iowa State College, are on leave with Project Rand, Santa
Monica, California ... The Virginia Academy of Science meetings were
held May 6 to 8, at Hotel Roanoke. At the sessions of the Section on
Statistics, A. E. Brandt gave talks on “Some statistical aspects of
experimentation” and “On the correlation of a part with the whole.” The
officers for this section are Boyd Harshbarger, Chairman; W. H. White,

. Mr: Harshbarger 






the Statistical' toniverfSty of 

, formerly Reader in Mathematics, Hr^wood College, University of
London, appointed Professor of Mathematics and Research Associate
in the Statistical Laboratory; Charles M. Stein promoted to Assistant

and Research Associate; Elizabeth L. Scott promoted to
Lecturer and Research Assistant; Edith Mourier, formerly Research
Assistant at the Institut Henri Poincaré, Paris, appointed Research
Assistant and Teaching Assistant at the Statistical Laboratory; and

SYSTEMATIC AND RANDOM SAMPLING FOR ESTIMATING
EGG PRODUCTION IN POULTRY*

A. W. Nordskog and S. Lee Crump

Iowa State College†


INTRODUCTION 


The problem of determining the ideal period for incomplete
trapnesting to estimate egg production has been investigated during
the last thirty years by several workers. Thompson (1933), Olson (1939) 
and Hays (1946) have given good reviews of the subject. 

If the heritability of egg production were high, mass selection of 
breeding stock would be an efficient method of attaining genetic im¬ 
provement (Lush, 1946) and the accuracy of the production records for 
individual hens would be of primary importance. Daily trapnesting 
would provide accurate individual records. That mass selection is not 
an efficient method in breeding for improved egg production has long 
been recognized (Gowell, 1903). Lerner and Hazel (1947) estimated
that the heritability of egg production based on individual hens was 
about 5 per cent in the flock that they studied. 1 

Greater breeding progress is possible by selection of sire-progeny 
and family groups than by simple mass selection. Thus, the accuracy 
of the average record for the group is most important, while that for 
individual records is secondary. Errors in individual records are only
one source of error in the average record for the group. The other
important source of error is the variation caused by true differences in
productive ability among the individuals of a family or sire-progeny
group.

In the development and testing of inbred lines and crosses among
them, it is again the accuracy of the record for the group which is of
primary importance. Since trapnesting is costly, there is need for an
accurate assessment of the relative sizes of the errors in group records
arising from various sources. It is the purpose of this paper to report
the relative sizes of errors arising from incomplete trapnesting of
individual hens, and from variation among the hens of a group, under
several schemes of incomplete trapnesting.

THE DATA

The data include the complete trapnest records for 20 top-cross White 
Leghorns (inbred males X non-inbred females), selected at random, 
from each of five sire-progeny groups. A restriction imposed on the 
data was that only those hens finishing their first laying year were 
selected (1940-41). All the hens were housed together in two adjoining 
pens, each containing about 150 birds. The average production for the 
100 hens used was 173 eggs per hen per year. 

Because of the seasonal fluctuation in egg production, any system of 
incomplete trapnesting to estimate total annual egg production should 
extend over the whole year. One obvious way of guaranteeing this is to 
consider only systems which include at least one day in each month of 
the year. Accordingly, the complete record for each hen was divided into 
28 basic sampling units formed as follows: the first basic sampling unit 
includes the record for the first of each month, the second includes the 
record for the second of each month, etc. For convenience, only the 
first 28 days of each month were used. The results which would have 
been obtained using the full monthly record for each hen would not 
differ materially from those reported here. It should be noted that each 
of the basic sampling units is itself a systematic sample of the days of 
the year as defined by Madow (1946). Now, if the basic sampling units 
for each hen are denoted by 

$$x_1, x_2, \ldots, x_{28}$$

the systems of incomplete trapnesting considered in this paper may be 
defined as follows: 

A Interval-day trapnesting. The trapnesting days are spaced at
regular intervals within the months. There are $d_a = 28/a$ possible
interval-day samples of size $a$ (i.e. which include $a$ of the basic
sampling units). Let $S_k(a)$ denote the $k$th interval-day sample of size
$a$, $k = 1, 2, \ldots, d_a$. Then $S_k(a)$ consists of the following basic
sampling units:

$$x_k,\ x_{k+d_a},\ x_{k+2d_a},\ \ldots,\ x_{k+(a-1)d_a}.$$

To obtain an interval-day sample of size $a$, one of the samples
$S_1(a), S_2(a), \ldots, S_{d_a}(a)$ is selected at random.

To illustrate, when trapnesting is to be carried out on $a = 4$ days
per month there are $d_4 = 28/4 = 7$ possible interval-day samples.
Two of the possible samples are made up as follows:





$S_2(4)$: $x_2, x_9, x_{16}, x_{23}$
$S_5(4)$: $x_5, x_{12}, x_{19}, x_{26}$

B Consecutive-day trapnesting. The trapnesting days within each
month are consecutive. As in the preceding case there are $d_a = 28/a$
possible consecutive-day samples of size $a$. Let $S'_k(a)$ denote the
$k$th consecutive-day sample of size $a$, $k = 1, 2, \ldots, d_a$. Then
$S'_k(a)$ consists of the following basic sampling units:

$$x_{a(k-1)+1},\ x_{a(k-1)+2},\ \ldots,\ x_{a(k-1)+a}.$$

To obtain a consecutive-day sample of size $a$, one of the samples
$S'_1(a), S'_2(a), \ldots, S'_{d_a}(a)$ is selected at random.

Using again the previous illustration, there are 7 possible
consecutive-day samples. Two of the samples are made up as follows:

$S'_3(4)$: $x_9, x_{10}, x_{11}, x_{12}$
$S'_7(4)$: $x_{25}, x_{26}, x_{27}, x_{28}$

C Random-day trapnesting. The trapnesting days within each month
are randomly distributed. To obtain a random-day sample of size
$a$, $a$ of the samples

$$S_1(1), S_2(1), \ldots, S_{28}(1)$$

are selected at random. The samples $S_1(1), S_2(1), \ldots, S_{28}(1)$ are
of course just the basic sampling units.

When $a = 1$ the three systems are identical. Since each basic
sampling unit includes one day in each month the total number of
trapnest days in any sample of size $a$ is $12a$. In the present
discussion $a$ takes the values 1, 2, 4, 7, and 14. These values are
convenient since they are the factors of 28.
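
The three definitions above can be sketched in a few lines of Python. This is only an illustration of the indexing, not code from the paper: the function names are hypothetical, and each function returns the subscripts of the basic sampling units (days of the month, 1 to 28) that make up one sample.

```python
import random

DAYS_PER_MONTH = 28  # basic sampling units x_1 ... x_28


def _number_of_samples(a):
    """d_a = 28 / a, defined only when a divides 28."""
    if DAYS_PER_MONTH % a != 0:
        raise ValueError("a must be a factor of 28")
    return DAYS_PER_MONTH // a


def interval_day_sample(a, k):
    """Units of the k-th interval-day sample: x_k, x_{k+d}, ..., x_{k+(a-1)d}."""
    d = _number_of_samples(a)
    if not 1 <= k <= d:
        raise ValueError("k must lie between 1 and d_a")
    return [k + m * d for m in range(a)]


def consecutive_day_sample(a, k):
    """Units of the k-th consecutive-day sample: x_{a(k-1)+1}, ..., x_{a(k-1)+a}."""
    d = _number_of_samples(a)
    if not 1 <= k <= d:
        raise ValueError("k must lie between 1 and d_a")
    return [a * (k - 1) + m for m in range(1, a + 1)]


def random_day_sample(a, rng=None):
    """A random-day sample: a of the 28 basic units drawn without replacement."""
    rng = rng or random.Random()
    return sorted(rng.sample(range(1, DAYS_PER_MONTH + 1), a))


if __name__ == "__main__":
    print(interval_day_sample(4, 2))
    print(consecutive_day_sample(4, 3))
    print(random_day_sample(4, random.Random(1948)))
```

With $a = 4$ the first two calls list the units of the second interval-day sample and the third consecutive-day sample used as illustrations in the text.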

THEORY 

The following discussion will refer to interval-day trapnesting. With 
appropriate changes in notation it applies also to consecutive-day 
trapnesting. 

Let $y_{ijk}(a)$ denote the egg production in the sample $S_k(a)$ (adjusted
to an estimate of the total yearly production) for the $j$th hen in the $i$th
group. Then the model expressing the egg production in this sample is
given by the following equations:

$$y_{ijk}(a) = \mu + g_i + h_{ij} + s_k(a) + r_{ik}(a) + e_{ijk}(a),$$

$$i = 1, 2, \ldots, p; \qquad j = 1, 2, \ldots, q; \qquad k = 1, 2, \ldots, d_a$$

where:

1. $\mu$ = the true average yearly production per hen over all $p$ groups.

2. $g_i$ = the deviation of the true mean of the $i$th group from $\mu$.

3. $h_{ij}$ = the deviation of the production of the $j$th hen in the $i$th
group from $\mu + g_i$. The $h_{ij}$ are a random sample from an infinite
population with mean zero and variance $\sigma_h^2$.

4. $s_k(a)$ = the deviation of the production in sample $S_k(a)$,
averaged over all $pq$ hens, from $\mu$. These deviations comprise a
finite population with mean zero and variance
$\sigma_s^2(a) = \frac{1}{d_a} \sum_k s_k^2(a)$.

5. $r_{ik}(a)$ = the deviation of the production in sample $S_k(a)$,
averaged over the $q$ hens in the $i$th group, from $\mu + g_i + s_k(a)$.
For each $i$ these deviations comprise a finite population with
mean zero and variance $\sigma_r^2(a) = \frac{1}{d_a} \sum_k r_{ik}^2(a)$.

6. $e_{ijk}(a)$ = the deviation of the production in $S_k(a)$ for the $j$th
hen in the $i$th group from $\mu + g_i + h_{ij}$. These deviations are a
random sample from an infinite population with mean zero and
variance $\sigma_e^2(a)$, subject to the restrictions $\sum_k e_{ijk}(a) = 0$.

Now assume that a single interval-day sample of size $a$ is selected
for $q$ hens in each of two groups. Let $S_k(a)$ be the sample selected. Then
the difference in yearly production between the two groups, $g_1 - g_2$, is
estimated by

$$D(a) = \bar{y}_{1 \cdot k}(a) - \bar{y}_{2 \cdot k}(a)$$

where

$$\bar{y}_{h \cdot k}(a) = \frac{1}{q} \sum_j y_{hjk}(a), \qquad h = 1, 2.$$

The sampling variance of D(a) is 

V[D(a)] = 2 ^ Wl + <£(<*)] + <*«)}. 

If two interval-day samples of size $a$ are drawn independently, the
first being used for the $q$ hens in the first group, and the second for the $q$
hens in the second group, then $g_1 - g_2$ is estimated by

$$D_1(a) = \bar{y}_{1 \cdot k}(a) - \bar{y}_{2 \cdot l}(a)$$

where the $k$th and $l$th samples are the ones selected for the first and
second groups respectively. In this case the sampling variance is

Vm*)] = 2<g [d + a 2 .(a)] + d(a) + 

If a single random-day sample of size $a$ were selected for $q$ hens in
each of two groups, letting $S_{k_1}(1), S_{k_2}(1), \ldots, S_{k_a}(1)$ denote the sample
selected, then the group difference is estimated by

$$D''(a) = \bar{y}''_{1 \cdot}(a) - \bar{y}''_{2 \cdot}(a)$$

where

$$\bar{y}''_{h \cdot}(a) = \frac{1}{qa} \sum_j \sum_{m=1}^{a} y_{h j k_m}(1), \qquad h = 1, 2.$$

The sampling variance of D"{a) is 

V[D"(cc)] = 2 ^ [cl + *&)] + 

If a different random-day sample is independently drawn for each group,
$g_1 - g_2$ is estimated by

£('(«) = vU«r - VM” 

■with sampling variance 

V[D['{a)} = 2^ [cl + 1 )] + [<r ' (1) + <r * (1)] 

where $S_{l_1}(1), S_{l_2}(1), \ldots, S_{l_a}(1)$ is the sample drawn for the second
group.

It will be convenient to compare the three trapnest sampling schemes
in terms of the sampling variances of estimates of group differences. In
order to estimate these sampling variances it is necessary to estimate
the variance components $\sigma_h^2$, $\sigma_e^2(a)$, $\sigma_r^2(a)$ and $\sigma_s^2(a)$ for interval-day and
consecutive-day samples when $a = 1, 2, 4, 7, 14$. (The variance
components and their estimates will be “primed” for consecutive-day samples.)
These estimates may be computed easily from an analysis of sums of
squares. Table 1 shows the appropriate analysis for any value of $a$ for



TABLE 1 

THE ANALYSIS OF THE SUMS OF SQUARES FOR $d_a$ SAMPLES OF SIZE $a$ FROM EACH OF $q$ HENS IN EACH OF $p$ GROUPS.



(The a-identification in the y’s has been dropped for convenience.) 
interval-day samples. The analysis for consecutive-day samples is
analogous, and for $a = 1$ is, of course, identical. The last column of
Table 1 contains the expectations of the sums of squares. These
expectations may be verified by reference to Daniels (1939) and Crump (1946).

Estimates of the variance components are obtained by setting the
sums of squares of a given analysis equal to their expectations with $s^2$
substituted for $\sigma^2$, and solving the resulting set of linear equations for
$s_h^2$, $s_e^2(a)$, $s_r^2(a)$ and $s_s^2(a)$. The estimates of the $\sigma^2$'s, then, are the
corresponding $s^2$'s.


RESULTS 

In order to clarify the method of estimating the variance components 
outlined in the preceding section, the numerical analysis of the sums of 
squares for a = 1 is shown in Table 2. 


TABLE 2

THE ANALYSIS OF THE SUMS OF SQUARES FOR a = 1

Source of Variation            Degrees of   Sums of      Expectations of the
                               Freedom      Squares      Sums of Squares

Groups                         4            543,419      $4 \cdot 28\,\sigma_h^2 + 20 \cdot 28 \sum g_i^2$
Samples                        27           130,027      $28\,\sigma_e^2(1) + 20 \cdot 28\,\sigma_r^2(1) + 5 \cdot 20 \cdot 28\,\sigma_s^2(1)$
Hens within groups             95           3,642,188    $5 \cdot 19 \cdot 28\,\sigma_h^2$
Samples X Groups               108          224,387      $4 \cdot 28\,\sigma_e^2(1) + 4 \cdot 20 \cdot 28\,\sigma_r^2(1)$
Samples X Hens within groups   2565         5,245,953    $5 \cdot 19 \cdot 28\,\sigma_e^2(1)$

The sums of squares and their expectations constitute a set of linear 
equations for estimating the variance components. Solving these equa¬ 
tions gives the following estimates: 

$s_h^2 = 1369.24$
$s_e^2(1) = 1972.16$
$s_s^2(1) = 26.40$
$s_r^2(1) = 1.56$
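
The solution of these equations can be checked with a short calculation. The sketch below is not part of the original paper; it assumes the coefficients of the expectations as they are read from Table 2 (for example $5 \cdot 19 \cdot 28$ for hens within groups), and the dictionary keys are names chosen here for illustration.

```python
# Sums of squares from Table 2 (a = 1): 5 groups, 20 hens per group, 28 samples.
SS_SAMPLES = 130_027
SS_HENS_WITHIN_GROUPS = 3_642_188
SS_SAMPLES_X_GROUPS = 224_387
SS_SAMPLES_X_HENS = 5_245_953


def estimate_components():
    """Equate each sum of squares to its expectation and solve in turn."""
    # Hens within groups: 5*19*28 * s_h^2
    s_h2 = SS_HENS_WITHIN_GROUPS / (5 * 19 * 28)
    # Samples x hens within groups: 5*19*28 * s_e^2(1)
    s_e2 = SS_SAMPLES_X_HENS / (5 * 19 * 28)
    # Samples x groups: 4*28 * s_e^2(1) + 4*20*28 * s_r^2(1)
    s_r2 = (SS_SAMPLES_X_GROUPS - 4 * 28 * s_e2) / (4 * 20 * 28)
    # Samples: 28 * s_e^2(1) + 20*28 * s_r^2(1) + 5*20*28 * s_s^2(1)
    s_s2 = (SS_SAMPLES - 28 * s_e2 - 20 * 28 * s_r2) / (5 * 20 * 28)
    return {"s_h2": s_h2, "s_e2": s_e2, "s_s2": s_s2, "s_r2": s_r2}


if __name__ == "__main__":
    for name, value in estimate_components().items():
        print(f"{name} = {value:.2f}")
```

The groups line of Table 2 is not needed for these four estimates, since it involves the fixed group effects rather than a further variance component.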

The component arising from variation among hens, $\sigma_h^2$, is clearly
independent of the sampling scheme and of the size of the sample.
Hence, the estimate $s_h^2$ will remain the same in all analyses of the sums of
squares. The variation arising from the difference in the behavior of a
given sampling scheme over the groups is measured by $\sigma_r^2(a)$ ($\sigma_r'^2(a)$ in
the case of consecutive-day sampling). On intuitive grounds it is not
expected that this variation will be large. For $a = 1$, $s_r^2(1) = 1.56$ is
comparatively very small. For other values of $a$ both $s_r^2(a)$ and $s_r'^2(a)$
are also small, and in the remainder of the discussion $\sigma_r^2(a)$ and $\sigma_r'^2(a)$
are neglected.

Table 3 gives the estimates of the variance components, $s_e^2(a)$, $s_s^2(a)$,
$s_e'^2(a)$ and $s_s'^2(a)$.


TABLE 3

ESTIMATES OF THE VARIANCE COMPONENTS

Sample Size   Interval-day Samples       Consecutive-day Samples
a             $s_e^2(a)$   $s_s^2(a)$    $s_e'^2(a)$   $s_s'^2(a)$

2             923.7        6.1           862.1         23.6
4             430.8        0.0           333.7         19.6
7             244.0        0.0           195.5         13.3
14            98.5         0.0           74.0          12.4


It is to be noted that for all values of $a$, $s_e^2(a) > s_e'^2(a)$ and $s_s^2(a) <
s_s'^2(a)$. Both of these inequalities indicate that the differences among
interval-day samples are less consistent than for those taken on consecu¬ 
tive days. This fact is not entirely unexpected, and is reflective of cyclic 
changes in egg production within months. Thus, its influence will be 
more apparent when samples are taken on consecutive days. 

Table 4 shows the sampling errors of the difference between average
yearly production for two groups of 20 birds each sampled on the same 
days, and sampled on different days. 

The most notable feature about Table 4 is the apparent uniformity 
within the lines of the table. The differences between the 3 sampling 
schemes are small when groups of size 20 are to be compared. It is 
interesting to observe that for groups trapnested on the same days the 
consecutive-day scheme has the lowest sampling error for all values of 
a , but when groups are trapnested on different days it has the highest 
for all values of a . This results from the two inequalities mentioned in 
connection with Table 3. There is only a small loss in accuracy in 
trapnesting 14 days per month under any of the sampling schemes 
compared with trapnesting 28 days per month. The former shows an 

TABLE 4

SAMPLING ERRORS OF THE DIFFERENCE BETWEEN THE AVERAGE YEARLY
PRODUCTION OF TWO GROUPS OF 20 BIRDS EACH FOR DIFFERENT SAMPLING
SCHEMES

(Results Are in Number of Eggs per Hen per Year)

              Groups trapnested on same days      Groups trapnested on different days
Sample size   Interval   Consecutive   Random     Interval   Consecutive   Random
a             days       days          days       days       days          days

1             18.3                     18.3       19.0                     19.0
2             15.1       14.9          15.2       15.3       15.7          15.6
4             13.4       13.0          13.4       13.4       13.8          13.7
7             12.7       12.5          12.6       12.7       13.0          12.7
14            12.1       12.0          12.0       12.1       12.5          12.1
28                       11.7





approximate sampling error of 12.1 eggs compared with 11.7 eggs for the 
latter. Also, there is only a difference of about 1-1/2 eggs in sampling 
error between trapnesting 14 days and 4 days per month. It is evident 
that the error resulting from incomplete trapnesting is of minor im¬ 
portance when groups as large as 20 are being compared. The relative 
importance of group size (number of birds) and the completeness of 
trapnesting in reducing sampling variance will be considered in the next 
section. 
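
The interval-day and consecutive-day entries of Table 4 can be approximately reproduced from the estimates in Table 3. The sketch below rests on assumptions that are not statements from the paper: the same-day variance is taken as $(2/q)[s_h^2 + s_e^2(a)]$ with $s_r^2(a)$ neglected, as the text says it is; the different-day entries are matched numerically by adding $s_s^2(a)$ once, which is an observation about the printed values rather than a restatement of the variance formula; and for complete trapnesting ($a = 28$) both sampling components are set to zero. All names are illustrative.

```python
import math

Q = 20          # hens per group
S_H2 = 1369.24  # hen-to-hen component

# Estimates of s_e^2(a) and s_s^2(a); a = 1 comes from the analysis of Table 2,
# a = 28 (complete trapnesting) is assumed to carry no sampling error.
S_E2 = {
    "interval":    {1: 1972.16, 2: 923.7, 4: 430.8, 7: 244.0, 14: 98.5, 28: 0.0},
    "consecutive": {1: 1972.16, 2: 862.1, 4: 333.7, 7: 195.5, 14: 74.0, 28: 0.0},
}
S_S2 = {
    "interval":    {1: 26.40, 2: 6.1, 4: 0.0, 7: 0.0, 14: 0.0, 28: 0.0},
    "consecutive": {1: 26.40, 2: 23.6, 4: 19.6, 7: 13.3, 14: 12.4, 28: 0.0},
}


def sampling_error(scheme, a, same_days=True, q=Q):
    """Approximate standard error of the difference between two group means."""
    variance = (2.0 / q) * (S_H2 + S_E2[scheme][a])
    if not same_days:
        # Assumption: one s_s^2(a) term reproduces the printed different-day column.
        variance += S_S2[scheme][a]
    return math.sqrt(variance)


if __name__ == "__main__":
    for a in (1, 2, 4, 7, 14, 28):
        row = [
            sampling_error(scheme, a, same_days)
            for same_days in (True, False)
            for scheme in ("interval", "consecutive")
        ]
        print(a, " ".join(f"{x:5.1f}" for x in row))
```

Each printed row gives, for one value of $a$, the same-day interval and consecutive errors followed by the different-day interval and consecutive errors.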

PRACTICAL APPLICATION 

Hays (1946) states that daily trapnesting adds about one dollar per 
year to the cost of keeping a hen. The trend in present poultry breeding 
operations to minimize the importance of high individual hen records 
and correspondingly to lay greater stress on the average production of 
families of full sibs or half-sibs makes it desirable to consider jointly the 
error resulting from incomplete trapnesting and limited group size. 
Both affect the accuracy of a family average. With a knowledge of the 
error expected from these two sources it is possible to estimate the level 
of accuracy for any combination of number of days of trapnesting and 
flock size. 

Figure 1 shows graphically the percent increase in group or flock size 
that will just offset the errors resulting from incomplete sampling when 
trapnesting is conducted on the same days for all hens in each group. 

FIGURE 1. The per cent increase in number of birds per group that will
offset the loss in accuracy due to incomplete trapnesting.

The required increase in flock size is proportional to

$$\sigma_e^2(a) / \sigma_h^2$$

for any given system of trapnesting. From the graph it is seen that the
accuracy lost by trapnesting 14 days per month compared with complete 
trapnesting could be recovered by increasing group size by only 7 per¬ 
cent. Increasing group size by 25 percent would make it necessary to 
trapnest only 4 days per month, or 1/7 as much. Thus, groups such as 
full-sibs averaging 8 in number and trapnested daily would be no more 
reliable than groups of 10 trapnested only 4 days per month. For testing 
large groups such as the progeny of a sire where groups may contain as 
many as 30 birds, the results show that 32,35,39 and 50 birds trapnested 
14, 7, 4 and 2 days per month, respectively, would give production 
records as accurate as those obtained from 30 birds trapnested daily. 
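
The group sizes quoted above can be approximately recovered from the variance components. The following sketch is an illustration under stated assumptions, not the authors' computation: it takes the same-day variance of a group difference to be proportional to $[s_h^2 + s_e^2(a)]/q$, neglects $s_r^2(a)$, and uses the Table 3 estimates. The function names are hypothetical.

```python
S_H2 = 1369.24  # hen-to-hen component

# s_e^2(a) from Table 3
S_E2_INTERVAL = {2: 923.7, 4: 430.8, 7: 244.0, 14: 98.5}
S_E2_CONSECUTIVE = {2: 862.1, 4: 333.7, 7: 195.5, 14: 74.0}


def percent_increase(s_e2, s_h2=S_H2):
    """Per cent more birds needed to offset incomplete trapnesting."""
    return 100.0 * s_e2 / s_h2


def equivalent_group_size(base_size, s_e2, s_h2=S_H2):
    """Group size giving the same variance as base_size birds trapnested daily."""
    return base_size * (1.0 + s_e2 / s_h2)


if __name__ == "__main__":
    for a in (14, 7, 4, 2):
        size = equivalent_group_size(30, S_E2_INTERVAL[a])
        print(f"a = {a:2d}: about {size:.1f} birds, "
              f"{percent_increase(S_E2_INTERVAL[a]):.1f} per cent more (interval-day)")
    print(f"a =  4, consecutive-day: {percent_increase(S_E2_CONSECUTIVE[4]):.1f} per cent more")
```

Under these assumptions the interval-day estimates round to the 32, 35, 39 and 50 birds quoted for a sire-progeny group of 30, while the consecutive-day estimate for 4 days per month is the one that lies close to 25 per cent.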

SUMMARY

One hundred first year egg production records were divided into 
partial records of 1, 2, 4, 7 and 14 days per month to correspond to 
different degrees of incomplete trapnesting. Three methods of incomplete
trapnesting (sampling) were considered: (1) interval-day, with each
trapnest day spaced at regular intervals throughout the month, (2)
consecutive-day, with the trapnest days taken consecutively, and (3)
random-day trapnesting. For each size of sample and method of sampling
the total variance in egg production was separated into three
components: hens, sampling days, and remainder. From these, sampling
errors were estimated for the different methods and size of samples. 

The differences in accuracy among the three methods are small. 
Interval-day trapnesting may be slightly more accurate when groups of 
birds to be compared are trapnested on different days, but when trap¬ 
nesting is conducted on the same days consecutive-day sampling is 
slightly more accurate. The standard error of the difference of two 
groups of 20 birds is about 11.7 eggs per hen per year under complete 
trapnesting. The standard error increases by about .4 of an egg when 
trapnesting is reduced to a half-time basis and by 1.5 eggs when the 
amount of trapnesting is reduced to 1/7 of the time. The accuracy lost 
by trapnesting 14 days per month could be recovered by increasing 
group size only 7 per cent, while that for trapnesting only 4 days per 
month could be recovered by increasing group size 25 per cent. 

SAMPLING ALYCE CLOVER FOR CHEMICAL ANALYSES 1

J. A. Rigney and R. E. Blaser 2

Alyce clover (Alysicarpus vaginalis (L.) D.C.) is a summer
annual legume used for hay and pasture in Florida. Since the
moisture and fertility requirements of the plant were not known, several
tests were initiated to study its adaptation. The effects of various
fertilizer treatments on yield and herbage composition have
already been published [1].

During the progress of these experiments, the problem arose 
as to optimum techniques for sampling the plots for chemical 
evaluation. In the absence of sufficient information on the problem, 
sampling data were collected on one experiment and these data 
form the subject of this paper. 


METHODS 


The experiment was located near Gainesville, Florida, on a reasonably
uniform field of Norfolk fine sand. Ten fertilizer treatments
were randomized in each of three blocks. The plots were 10 x 30 feet and
were arranged within the blocks so as to keep the between-plot
variability at a minimum.

The fertilizer materials were broadcast uniformly and then disked 
once to a depth of about four inches by running a nine-foot disk length¬ 
wise through each plot. When cut for hay, the plants varied in height 
from 24 to 36 inches. Samples for chemical analyses were taken from 
the plots before mowing when about one-fourth of the flowers were open. 

Sampling data were taken on only five of the treatments in the three 
blocks. Two people independently obtained herbage samples from each 
of the fifteen plots. Each plot was sampled by following a zig-zag path 
lengthwise through the plot. A “grab sample” was taken at the end of 
alternate paces, making a total of 12 to 15 “grabs” per plot. The plants 
in each “grab” were cut approximately 3 inches above ground with a 
hand sickle. The green weight of the plant material per plot averaged 
5.5 pounds. 

The samples were dried in an oven at 70° C. and ground. After
thorough mixing, a subsample of the material was placed in a pint jar. 

The chemical analyses were made in the Agronomy Department 
laboratory at the University of Florida. Two sub-samples were drawn 


1 Joint contribution from the Institute of Statistics of The University of North Carolina, Raleigh,
the Florida Agricultural Experiment Station, Journal Paper No. 298.

2 Plant Science Statistician, Institute of Statistics, and Professor of Agronomy, Cornell University,
formerly Agronomist with the Florida Agricultural Experiment Station.

from each jar for analysis. Hence there were four separate ashings and
analyses from each plot, i.e., two sub-samples from each of two plot 
samples. Standard chemical analytical techniques were used to de¬ 
termine the percentage P, K, Ca and Mg in the samples. 

RESULTS AND DISCUSSION 

The analyses of variance for the four constituents are shown in the 
upper part of table I. Only those sources of variation which measure 


TABLE I

Analyses of Variance of Data on Phosphorus, Potassium,
Calcium and Magnesium Contents of Alyce Clover

Source of                          Mean Squares                            Composition
variation           d.f.   P          K          Ca         Mg         of M.S.

Replication         2      .004,399   .039,655   .006,625   .009,592
Treatments          4      .000,429   .103,821   .025,075   .010,489
Exp. error          8      .000,866   .107,729   .033,069   .003,675   $V_d + 2V_s + 4V_p$
Samples in plots    15     .000,239   .041,481   .018,827   .003,615   $V_d + 2V_s$
Determinations      30     .000,007   .000,306   .000,169   .000,328   $V_d$

Estimates of Variance Components

Plots ($V_p$)              .000,157   .016,562   .003,560   .000,015
Samples ($V_s$)            .000,116   .020,588   .009,329   .001,644
Determinations ($V_d$)     .000,007   .000,306   .000,169   .000,328



sampling variances are of interest in this paper, but mean squares for 
replication and treatments are included to indicate the complete analysis. 
The last column of the upper portion indicates the composition of the 
mean squares that are of interest. The estimates of the variance com¬ 
ponents given in the lower part of the tables were computed from the 
respective mean squares as indicated by their algebraic compositions. 
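
The step from mean squares to variance components can be sketched as follows. This is an illustrative calculation, not the authors' worksheet: it assumes the compositions $V_d$, $V_d + 2V_s$ and $V_d + 2V_s + 4V_p$ for the determinations, samples and experimental error lines, and it takes the magnesium mean square for samples in plots as .003615, the value that reproduces the printed magnesium components. The names are chosen here for illustration.

```python
# Mean squares from Table I (experimental error, samples in plots, determinations).
MEAN_SQUARES = {
    "P":  {"error": 0.000866, "samples": 0.000239, "determinations": 0.000007},
    "K":  {"error": 0.107729, "samples": 0.041481, "determinations": 0.000306},
    "Ca": {"error": 0.033069, "samples": 0.018827, "determinations": 0.000169},
    # Assumption: .003615 for samples in plots, consistent with the printed components.
    "Mg": {"error": 0.003675, "samples": 0.003615, "determinations": 0.000328},
}


def variance_components(ms):
    """Solve V_d, V_d + 2 V_s, V_d + 2 V_s + 4 V_p for the three components."""
    v_d = ms["determinations"]
    v_s = (ms["samples"] - ms["determinations"]) / 2.0
    v_p = (ms["error"] - ms["samples"]) / 4.0
    return {"plots": v_p, "samples": v_s, "determinations": v_d}


if __name__ == "__main__":
    for element, ms in MEAN_SQUARES.items():
        comps = variance_components(ms)
        print(element, {name: f"{value:.6f}" for name, value in comps.items()})
```

The divisors 2 and 4 reflect two determinations per sample and four determinations per plot in the scheme actually used.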

A comparison of the estimated variances indicates that the laboratory 
technique of sub-sampling the ground plant material and making the 
actual determinations was satisfactory, since the standard error for 
determinations was less than 5% in all cases. The variation between 
successive samples from the same plot was relatively large and even 
exceeded plot-to-plot variation for K, Ca and Mg. Plot variance for
Mg is unusually low compared to that found in other studies. 

The variance of a treatment mean may be conveniently indicated as

$$V_{\bar{x}} = \frac{V_p}{p} + \frac{V_s}{s} + \frac{V_d}{d}$$

where $V_p$, $V_s$ and $V_d$ are the estimated true variances due to plots,
samples in plots and determinations, and $p$, $s$ and $d$ are the number of
plots, samples and determinations per treatment, respectively. It has
been shown [2] that this form is useful in evaluating composites of several 
samples from one plot or composites of several plots. Table II gives 
the variance of a treatment mean for different numbers of plots and 


TABLE II

The Effect of Altering the Sampling Scheme on
the Accuracy and Cost of Treatment Means

Number      Number     Number     Variance     Confidence   Relative    Cost per unit
plots per   samples    detns.     of treat.    interval     cost per    information
treatment   per plot   per plot   mean                      treatment   (X100)

                                  Phosphorus
3           1          1          .000,093     .022         254         2.36
3           1          2          .000,092     .022         344         3.16
3           1          3          .000,092     .022         434         3.99
3           2          1 (1)      .000,073     .020         275         2.00
3           3          1 (1)      .000,067     .019         296         1.98
6           1          1          .000,046     .016         485         2.23
6           2          1 (1)      .000,037     .014         547         2.02
3 *         2          4          .000,072     .020         545         3.92

(1) Single determinations on composites of two or three samples.
* This is the scheme actually used in the study.


samples and determinations per plot for only phosphorus since it 
will illustrate the methods involved. The first three lines of the table 
show that increasing the number of determinations has little effect in 
reducing the sample variance. This is equivalent to saying that further 
refinement of the laboratory technique would be of little value under 
these conditions. However, taking more samples per plot is quite effective
in improving the accuracy (lines 3 and 4, table II). For example,
the variance of the mean was reduced from .000093 to .000073 by taking
two samples per plot and making a single determination on the composite.
A third sample per plot was not as effective as the second. It
should be emphasized here that taking an additional sample implies 
making a new randomly determined trip through the plot and collecting 
“12-15 grabs”. It is not possible to estimate from these data the 
effects of increasing the number of “grabs” in the initial trip. It should 
also be pointed out that these and following remarks assume that errors 
of subsampling the ground material will be unaffected by the number of 
samples composited. This seems to be a reasonable assumption at least 
for a limited range of samples if the material is thoroughly mixed. 

Another method of reducing the sample variance would be to increase 
the number of replications of the entire experiment. Doubling the number
of replications but taking the same number of samples and determinations
per plot would of course double the accuracy of a treatment mean.
However, increasing the number of plots is much more expensive than 
taking more samples per plot. Therefore, where the individual plots 
are poorly sampled, this may not prove to be the most efficient way to 
improve the technique, as shown below. 

The fifth column of Table II shows the confidence interval for each 
sampling procedure. This value is computed as $t_{(.05)}(V_{\bar{x}})^{1/2}$ and is
interpreted as the interval on either side of the observed mean having a 
95% probability of including the true value for that treatment. Thus, 
under these conditions, by making a single determination on a composite 
of two samples in each of three replications, one might confidently (95%) 
expect to be within 0.020 of the true phosphorus percentage for a given 
treatment. 
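The confidence intervals of Table II can be checked with a short calculation. The sketch below is illustrative only: the tabular value of t (2.306) assumes 8 degrees of freedom for error, as would arise from five treatments in three replications, and the function name is hypothetical.

```python
import math

# Two-sided 5% point of Student's t. The 8 degrees of freedom are an
# assumption: (5 treatments - 1) x (3 replications - 1).
T_05_8DF = 2.306

def half_width(variance_of_mean, t_value=T_05_8DF):
    """Half-width of the 95% interval: t * sqrt(variance of the treatment mean)."""
    return t_value * math.sqrt(variance_of_mean)

# Variances of a treatment mean for phosphorus, copied from Table II.
table_ii_variances = [0.000093, 0.000073, 0.000067, 0.000046, 0.000037]
intervals = [round(half_width(v), 3) for v in table_ii_variances]
print(intervals)
```

Under that assumption the rounded half-widths agree with the confidence interval column of Table II.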

Final decision as to the best method of reducing the variance of a 
treatment mean must include some notion of cost. Such an approach 
has been widely used by agricultural economists, but few attempts have 
been reported by agronomists. Several forage crops investigators supplied
the writers with estimates of the relative costs of the three main
procedures involved in this study. The estimates were given as the cost 
of the initial unit (plot, sample or determinations) and the relative cost 
of each additional unit. Average estimates were as follows: 


Relative Cost of

                    Initial    Additional
Plots                  60          40
Samples                10           7
Determinations         30          30

From such estimates, the relative cost of any sampling scheme may
be evaluated by setting cost = 60 + 40(p - 1) + 10 + 7(s - 1) + 30d.
Costs so derived are given in column 6 of table II.

The relative amount of information that is supplied by a particular
mean is inversely proportional to its variance. Therefore cost/(1/$V_{\bar{x}}$)
gives the relative cost per unit information. The sampling scheme that
gives the lowest cost per unit information may be considered the most
efficient. Some cost comparisons are shown in the last column of table
II. In general these data indicate that increasing the number of determinations
per plot is expensive when the sampling technique and the
plot-to-plot variations are so large. Improving the sampling of the
individual plots was desirable up to two or three samples per plot.
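The cost comparison can be sketched in a few lines. This is a minimal illustration, not the authors' own computation: it assumes that p, s and d in the cost function are totals per treatment (so that samples and determinations per plot are multiplied by the number of plots), a reading that reproduces the cost column of Table II. The function names are hypothetical.

```python
def relative_cost(p, s, d):
    """Relative cost with p plots, s samples and d determinations per treatment."""
    return 60 + 40 * (p - 1) + 10 + 7 * (s - 1) + 30 * d

def cost_per_unit_information(p, s, d, variance_of_mean):
    """Cost divided by information (1 / variance), scaled by 100 as in Table II."""
    return 100 * relative_cost(p, s, d) * variance_of_mean

# (plots, samples per plot, determinations per plot, variance of mean), Table II.
schemes = [
    (3, 1, 1, 0.000093),
    (3, 1, 2, 0.000092),
    (3, 2, 1, 0.000073),
    (3, 3, 1, 0.000067),
    (6, 1, 1, 0.000046),
]
for plots, samples_per_plot, detns_per_plot, v in schemes:
    p = plots
    s = plots * samples_per_plot   # assumed: total samples per treatment
    d = plots * detns_per_plot     # assumed: total determinations per treatment
    print(plots, samples_per_plot, detns_per_plot,
          relative_cost(p, s, d),
          round(cost_per_unit_information(p, s, d, v), 2))
```

Because the tabulated variances are rounded, the last column may differ from Table II in the second decimal place.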

If the above cost function is set up for a fixed degree of accuracy
($V_{\bar{x}}$) it can be minimized to give the optimum proportion of plots to
samples to determinations. Table III shows these ratios derived for the 
four constituents studied. 


TABLE III

Optimum Ratio of Number of Plots, Samples and
Determinations per Treatment

                     P      K      Ca     Mg
Plots                4      6      4      1
Samples              8      16     15     26
Determinations       1      1      1      6


For phosphorus, a single determination on the composite of 8 samples, 
two from each of four plots, would result in a minimum cost per unit 
information. The other constituents give similar ratios except Mg 
which had an unusually low plot variance. 

In practice, it would not be desirable to composite all replications as 
indicated in table III since no estimate of error would be available. 
However, if a large number of treatments is being tested in as many as 
four replications, it might be desirable to composite pairs of replications. 
Since it would be desirable to have a fair estimate of experimental error, 
a reasonable criterion for deciding how much compositing of replications 
could be done would be that the degrees of freedom for error did not 
fall below 15. The values for cost per unit information in table II indicated
that the number of plots and samples per plot could be varied
considerably without deviating seriously from the lowest cost. However, 
the number of determinations must be limited more carefully due to the 
high cost and low variation associated with the chemical analyses. 
There are some apparent inconsistencies between the ratios in table
III and those of table II. For example, in table II three samples per 
plot was slightly more efficient than two for phosphorus but the above 
ratios indicate that two is optimum. This disparity arises from the 
fact that more than the optimum number of determinations were used 
in table II. If the number of determinations is set equal to the number 
of plots, the optimum number of samples per plot becomes 2.6. 

Potassium was more variable than the other constituents although 
there was no apparent reason for this. A single determination on a 
composite of two samples from each of three plots would give a confidence 
interval that was 25% of the over-all mean for K, compared to 9%, 
14% and 14% for P, Ca and Mg, respectively. If it were desirable to 
reduce the confidence interval of K to 10% of the general mean, $V_{\bar{x}}$
would need to be reduced to .00139. According to the ratios of table III
the optimum technique would require 19 plots, 50 samples and 3 determinations.
While this number of plots and samples seems absurdly
high, it illustrates the difficulty of obtaining a high degree of accuracy 
with such variable material. 

SUMMARY AND CONCLUSION 

Duplicate determinations on each of two field samples from three 
replications of five fertilizer treatments on Alyce Clover provided data 
for estimating variances due to plots, samples and chemical determinations.
These estimates were obtained for percentage P, K, Ca and Mg
in the clover hay. The accuracy of treatment means involving different 
numbers of plots, samples and determinations was examined. Relative 
costs for the three phases of the procedure were estimated and the cost 
per unit information computed for the various schemes under study. 
The optimum ratio of plots: samples: determinations was calculated for 
a constant variance of the mean. 

In general, the relatively high cost and low variance of the laboratory 
determinations require that this part of the technique be reduced to a 
minimum. The optimum ratio of total samples to total determinations 
per treatment varied from 4 for Mg to 16 for K. Except for the unusually
low plot-to-plot variance of Mg, the optimum number of samples
per plot ranged from 2 for P to 4 for Ca. 
THE ANALYSIS OF COVARIANCE AND
NON-ORTHOGONAL COMPARISONS


M. H. Quenouille
Marischal College, Aberdeen, Scotland


INTRODUCTION 

F. Yates [1] has defined orthogonality as "that property of a design
which ensures that the different classes of effects to which the
experimental material is subject shall be capable of direct and separate
estimation without any entanglement." Obviously orthogonality is a
property to be desired in any design, but unfortunately the design of 
experiment cannot always be determined prior to the commencement of 
an experiment, while experiments which are planned as orthogonal are 
frequently ‘confounded’ by extraneous causes. Yates, for example, 
considered an experiment in the growth of chickens on three different 
diets. Because of the difficulty of determining the sex at hatching, the 
proportions of cockerels in the three groups will usually vary, and since 
cockerels grow faster than pullets, the effect of sex must be taken into 
account if the comparisons between diets are to be accurate. 

Commonly the method of least squares is employed in the analysis 
of non-orthogonal comparisons, but the labour involved in solving 
several simultaneous equations and estimating, in R. A. Fisher's notation [2],
the elements $c_{ij}$ of the inverse matrix is frequently large. The
purpose of this note is to show how, when the deviations from orthogonality
are small, it is often possible to carry out this calculation using
the analysis of covariance, thus reducing the algebraic procedure of
solving the normal least squares equations to the arithmetical procedure
of the analysis of covariance. The normal least squares equations give
rise to direct solutions when all comparisons are orthogonal, and while a
slight deviation from orthogonality will frequently yield a set of apparently
difficult equations, the analysis of covariance reduces these
equations with the minimum of calculation.
THE BASIS OF THE METHOD

The method is in fact an extension of that suggested by M. S. Bartlett [3]
for the simplest case of non-orthogonality, namely one missing
observation. Bartlett suggested that the analysis could be carried out
in the normal manner with the missing observation replaced by zero
(or any other convenient value), if a covariance analysis was simultaneously
carried out on a second set of observations in which the value
corresponding to the missing observation might be taken as one, and
all other observations as zero. This is quite a neat method of estimating
and adjusting for the missing observation, and as Bartlett pointed out,
it can be used to compensate for several missing values, although the
task becomes more arduous as the number of missing values is increased
and the degree of orthogonality is decreased. However, this same method
is useful for other slight deviations from orthogonality, as is demonstrated
by the following examples.


METHOD OF ANALYSIS 

Consider the experiment on the growth of chickens given by Yates. 
The total bird weights are given in Table 1, and the number of birds in 
each group is indicated in brackets. 


TABLE 1

Total Bird Weights

Treatment         A              B              C             Total
Cockerels     14.10 (5)      22.50 (9)      33.00 (12)     69.60 (26)
Pullets       19.90 (10)     10.98 (6)       5.46 (3)      36.34 (19)
Total         34.00 (15)     33.48 (15)     38.46 (15)    105.94 (45)


If a pseudo-variate of one is used for each of the cockerels and zero for 
each of the pullets, the analysis of covariance may be set out as in Table 2. 
TABLE 2

Analysis of Covariance

                                                    Sum of squares
                                                          of
                      d.f.      s.s.       s.p.     pseudo-variate
Treatments              2      0.9992     0.9795        1.6444
Error                  42      9.0534     7.4107        9.3334
Total                  44     10.0526     8.3902       10.9778

                         Regression        Deviations from regression
                      d.f.      s.s.          d.f.        s.s.
Treatments                                      2        0.4708
Error                   1      5.8841          41        3.1693
Treatments + Error      1      6.4125          43        3.6401
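Most of Table 2 can be recovered from the totals of Table 1 alone, because the pseudo-variate takes only the values one and zero. The following is a minimal sketch; the variable names are hypothetical, the total for treatment B is taken as 22.50 + 10.98 = 33.48, and the two sums of squares of the weights are copied from Table 2, since the individual bird weights are not reproduced here.

```python
# Group totals and bird counts from Table 1 (treatments A, B, C).
cockerel_totals = [14.10, 22.50, 33.00]
cockerel_counts = [5, 9, 12]
treatment_totals = [34.00, 33.48, 38.46]
birds_per_treatment = 15
n_birds = 45

grand_total = sum(treatment_totals)
n_cockerels = sum(cockerel_counts)

# Sums of squares of the pseudo-variate x (1 = cockerel, 0 = pullet).
# Because x is 0 or 1, the raw sum of x squared is the number of cockerels.
x_correction = n_cockerels ** 2 / n_birds
x_total = n_cockerels - x_correction
x_treat = sum(c ** 2 for c in cockerel_counts) / birds_per_treatment - x_correction
x_error = x_total - x_treat

# Sums of products of weight y with x: the raw sum of x*y is the cockerel total.
xy_correction = n_cockerels * grand_total / n_birds
sp_total = sum(cockerel_totals) - xy_correction
sp_treat = (sum(c * t for c, t in zip(cockerel_counts, treatment_totals))
            / birds_per_treatment - xy_correction)
sp_error = sp_total - sp_treat

# Regression sums of squares (one degree of freedom each).
reg_error = sp_error ** 2 / x_error
reg_total = sp_total ** 2 / x_total

# Sums of squares of the weights, taken as given from Table 2.
y_total, y_error = 10.0526, 9.0534
adjusted_treatments = (y_total - reg_total) - (y_error - reg_error)

print(round(x_treat, 4), round(x_error, 4), round(x_total, 4))
print(round(sp_treat, 4), round(sp_error, 4), round(sp_total, 4))
print(round(reg_error, 4), round(reg_total, 4), round(adjusted_treatments, 4))
```

The printed values should agree with Table 2 apart from rounding in the last decimal place.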


However this analysis may be alternatively set out as in Table 3. 


TABLE 3

                                            d.f.      s.s.
Sex                                           1      6.4125
Total (eliminating sex)                      43      3.6401
Total                                        44     10.0526

Sex (eliminating treatments)                  1      5.8841
Error (eliminating sex and treatments)       41      3.1693
Error (eliminating treatments)               42      9.0534

Treatments (eliminating sex)                  2      0.4708
Error                                        41      3.1693
Total (eliminating sex)                      43      3.6401
This analysis, apart from rounding-off errors, agrees with the analysis
given by Yates, although in order to complete the analysis the interaction
of sex and treatments must be tested by removing the joint effect
of sex and treatments, given by treatments + sex (eliminating treatments),
from the between class sum of squares. This has been done in
Table 4.


TABLE 4

Completed Analysis

                                    d.f.      s.s.       m.s.
Treatments (eliminating sex)          2      0.4708     0.2354
Sex (eliminating treatments)          1      5.8841     5.8841
Sex and Treatments Interaction        2      0.1039     0.0520
Between classes                       5      6.9872
Error                                39      3.0654     0.0786
Total                                44     10.0526



The interaction sum of squares in this case is negligible so that the basic
assumption of an equal effect of sex in the three treatment groups is
justified, but if the interaction was not negligible, further analysis would
be necessary. This might be carried out by using three pseudo-variates,
a, b and c, which take values one for each of the cockerel treatment
groups, in turn, and zero elsewhere, so that treatments (eliminating sex
and sex-interaction), and sex and sex-interaction (eliminating treatments)
are estimated. This analysis, although lengthy, is shortened by
the fact that the error terms in the sums of products of the pseudo-variates
are all zero so that each variate acts independently.¹

However this is not true for the totals so that it becomes necessary 
either to solve three simultaneous equations, or to carry out an analysis 


¹Cross-checks are also provided by adding the covariance analysis of the pseudo-variates with the
observations to obtain the analysis of covariance given in Table 2.
on sex eliminating the effect of treatments and their interactions by
using four pseudo-variates. This latter method is exactly equivalent 
to the method of weighted squares of means. 

The test of the difference between any pair or group of treatments 
can be carried out by the formulae and methods of the analysis of covariance
in the usual manner. For example, the adjusted mean difference
between treatments 1 and 2 is

$$\frac{1}{15}\left[34.00 - 33.48 + 4 \times \frac{7.4107}{9.3334}\right] = 0.2464$$

and its variance is

$$0.0786\left[\frac{1}{15} + \frac{1}{15} + \frac{(4/15)^2}{9.3334}\right] = 0.01108$$
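The adjusted difference and its variance follow the usual covariance formulae and can be verified numerically. The sketch below uses hypothetical variable names and assumes the standard form of the variance of an adjusted difference, with the error mean square 0.0786 of Table 4.

```python
# Values from Tables 1, 2 and 4.
total_a, total_b = 34.00, 33.48       # treatment weight totals
cockerels_a, cockerels_b = 5, 9       # pseudo-variate totals (cockerels per group)
n_per_treatment = 15
sp_error, x_error = 7.4107, 9.3334    # error sum of products and error s.s. of x
error_mean_square = 0.0786            # error mean square from Table 4

b = sp_error / x_error                # error regression of weight on the pseudo-variate
mean_diff_y = (total_a - total_b) / n_per_treatment
mean_diff_x = (cockerels_a - cockerels_b) / n_per_treatment

# Adjusted difference: remove the part of the weight difference due to sex ratio.
adjusted_diff = mean_diff_y - b * mean_diff_x

# Assumed standard form: s^2 [1/n + 1/n + (difference in x means)^2 / Exx].
variance = error_mean_square * (2 / n_per_treatment + mean_diff_x ** 2 / x_error)

print(round(adjusted_diff, 4), round(variance, 5))
```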

A second example of the same method is provided by the experiment 
given in Table 5. This experiment was designed to test nine treatments 
in three blocks, but a large and unaccountable trend showed across the 
blocks.²


TABLE 5

Block   Treatment   Yield, y    Treatment   Yield, y    Treatment   Yield, y    Total
  1         3          4.6          7          6.4          4         11.4
            2          4.5          9          9.5          1         11.1
            8          9.2          5         11.4          6         12.9       81.0
  2         9          9.1          4         11.5          8         15.1
            7          8.3          6         13.2          3         11.3
            2          5.3          1         11.3          5         12.0       97.1
  3         5          4.8          7         13.7          2         10.3
            6          3.4          8         11.9          9         12.9
            3          2.6          1          8.6          4         11.2       79.4
Total                 51.8                    97.5                   108.2      257.5

²This example should be compared with that given by Cochran [4].
Although there is some doubt as to whether the conditions of the
analysis of variance are satisfied it is interesting to see how this may be
analysed using pseudo-variates $x_1$, which takes values one in column 1
and zero elsewhere, and $x_2$, which takes values one in column 2 and zero
elsewhere. The analysis of covariance for this design is given in Table 6.


TABLE 6

                      d.f.     y^2      y x_1    y x_2    x_1^2   x_1 x_2   x_2^2
                               s.s.     s.p.     s.p.     s.s.     s.p.     s.s.
Blocks                  2      21.30     0.00     0.00    0.00     0.00     0.00
Treatments              8      92.49    -8.83     6.94    1.33    -1.00     1.33
Error                  16     193.46   -25.20     4.73    4.67    -2.00     4.67
Total                  26     307.25   -34.03    11.67    6.00    -3.00     6.00
Treatments + Error     24     285.95   -34.03    11.67    6.00    -3.00     6.00

                      d.f.     s.s.     m.s.
Treatments              8      38.87    4.86
Error                  14      47.73    3.41
Treatments + Error     22      86.60
Columns                 2     199.35
Blocks                  2      21.30
Total                  26     307.25
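The arithmetic behind Table 6 can be sketched directly from the plot yields of Table 5. This is an illustration under stated assumptions rather than the author's worksheet: the first plot of block 1 is read as treatment 3 (the reading under which every block contains each treatment once and the treatment sum of squares comes to 92.49), and all function names are hypothetical.

```python
# Table 5 layout: layout[block][row][column] = (treatment, yield).
layout = [
    [[(3, 4.6), (7, 6.4), (4, 11.4)],
     [(2, 4.5), (9, 9.5), (1, 11.1)],
     [(8, 9.2), (5, 11.4), (6, 12.9)]],
    [[(9, 9.1), (4, 11.5), (8, 15.1)],
     [(7, 8.3), (6, 13.2), (3, 11.3)],
     [(2, 5.3), (1, 11.3), (5, 12.0)]],
    [[(5, 4.8), (7, 13.7), (2, 10.3)],
     [(6, 3.4), (8, 11.9), (9, 12.9)],
     [(3, 2.6), (1, 8.6), (4, 11.2)]],
]

# One record per plot: (block, treatment, y, x1, x2), where x1 and x2 are the
# pseudo-variates marking column 1 and column 2.
plots = [(b, t, y, float(c == 0), float(c == 1))
         for b, block in enumerate(layout)
         for row in block
         for c, (t, y) in enumerate(row)]
n = len(plots)
BLOCK, TREATMENT, Y, X1, X2 = 0, 1, 2, 3, 4

def cross(u, v, key=None):
    """Corrected sum of products of variates u and v.
    key=None gives the total line; otherwise the line between the groups
    defined by record position `key` (block or treatment)."""
    correction = sum(p[u] for p in plots) * sum(p[v] for p in plots) / n
    if key is None:
        return sum(p[u] * p[v] for p in plots) - correction
    groups = {}
    for p in plots:
        su, sv, k = groups.get(p[key], (0.0, 0.0, 0))
        groups[p[key]] = (su + p[u], sv + p[v], k + 1)
    return sum(su * sv / k for su, sv, k in groups.values()) - correction

def error_line(u, v):
    return cross(u, v) - cross(u, v, BLOCK) - cross(u, v, TREATMENT)

def treatments_plus_error_line(u, v):
    return cross(u, v) - cross(u, v, BLOCK)

def residual_ss(line):
    """Sum of squares of y after regression on x1 and x2 within the given line."""
    s11, s12, s22 = line(X1, X1), line(X1, X2), line(X2, X2)
    q1, q2 = line(Y, X1), line(Y, X2)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * q1 - s12 * q2) / det
    b2 = (s11 * q2 - s12 * q1) / det
    return line(Y, Y) - b1 * q1 - b2 * q2

error_ss = residual_ss(error_line)
treat_plus_error_ss = residual_ss(treatments_plus_error_line)
adjusted_treatment_ss = treat_plus_error_ss - error_ss
print(round(error_ss, 2), round(treat_plus_error_ss, 2), round(adjusted_treatment_ss, 2))
```

Only a two-by-two set of equations is solved in each line, which is the saving over fitting constants that the text describes. The results agree with the lower part of Table 6 to within a few units in the second decimal place.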


The values involved in this analysis of covariance are easily calculated
and the whole process is less tedious than the matrix inversion that would
be necessary if constants were fitted. It is again possible to test particular
sets of treatments. For example, the main purpose of this experiment
was to compare plots receiving no lime (treatments 1, 2, 3) with
those receiving a dressing of limestone (treatments 4, 5, 6) and a dressing
of slag (treatments 7, 8, 9). This may be done by the analysis of
covariance as shown in Table 7.
TABLE 7

                 d.f.     y^2      y x_1    y x_2    x_1^2   x_1 x_2   x_2^2
                          s.s.     s.p.     s.p.     s.s.     s.p.     s.s.
Lime               2      44.95    -2.46     2.95    0.22    -0.11     0.22
Lime + Error      18     238.41   -27.66     7.68    4.89    -2.11     4.89

                 d.f.     s.s.     m.s.     Variance ratio
Lime               2      29.62    14.81        4.36
Error             14      47.73     3.40
Lime + Error      16      77.35
ERRATA

BIOMETRICS, March 1948, Volume 4, Number 1.

Under the discussion of "A Quantitative Theory of Genetic Recombination
and Chiasma Formation" by R. A. Fisher, the parenthetical
comment, "(Such a notation was presented orally, but is omitted from
the written proceedings subject to publication elsewhere.)", does not
apply to the discussion by Alexander Weinstein but to that of J.
Lederberg.















ON A FORMULA FOR THE PREDICTION 
OF CRANIAL CAPACITY 

C. Radhakrishna Rao 
and 

D. C. Shaw 
Duckworth Laboratory, 

Cambridge, England 

INTRODUCTION 

One of the uses of the regression equation is for the prediction of
the dependent variate for a given set of concomitant variates. 
For instance, a skull may be broken so that the actual cranial capacity 
could not be determined. In such a case the capacity may be capable 
of being predicted, if at least some external measurements are available. 
This requires the construction of the regression equation between the 
cranial capacity and the observed set of the external measurements on 
the skull. 

Various formulae have been constructed for this purpose and the
most widely used are those by Isserlis (1914), Hrdlička (1925), Hooke
(1926), etc.

In this article we suggest a new formula for the regression equation
and derive the constants from the measurements given by Hooke (1926)
for 86 male skulls excavated from Farringdon Street, London.

The statistical methods used in comparing various formulae have 
been presented in full for the convenience of biometricians who may be 
interested in such studies. 

The regression equation.

Three important measurements from which the cranial capacity (C)
could be predicted are the glabella-occipital length (L), the maximum
parietal breadth (B) and the basio-bregmatic height (H'). Since the
magnitude to be estimated is a volume, it is appropriate to set up a
regression formula of the type

$$C = \alpha' L^{\beta_1} B^{\beta_2} {H'}^{\beta_3}$$

where $\alpha'$, $\beta_1$, $\beta_2$ and $\beta_3$ are the constants to be estimated. Transforming
the variables to

$$y = \log_{10} C, \quad x_1 = \log_{10} L, \quad x_2 = \log_{10} B, \quad x_3 = \log_{10} H',$$

the formula can be written as

$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$$

where $\alpha = \log_{10} \alpha'$. Starting from this equation we propose to estimate
the constants by the method of least squares.

Estimation of the constants.

Using the measurements on the 86 male skulls from the Farringdon
series we find the mean values

$$\bar{y} = 3.1685, \quad \bar{x}_1 = 2.2752, \quad \bar{x}_2 = 2.1523, \quad \bar{x}_3 = 2.1128.$$

The corrected sum of products matrix $(S_{ij})$ for $x_1$, $x_2$, $x_3$ is

    .01875    .00848    .00684
    .00848    .02904    .00878
    .00684    .00878    .02886

The corrected sums of products of y with $x_1$, $x_2$ and $x_3$ are respectively

$$Q_1 = .03030, \quad Q_2 = .04410, \quad Q_3 = .03629$$

The reciprocal of the matrix $(S_{ij})$ obtained by the c-matrix method of
Fisher is

     64.21    -15.57    -10.49
    -15.57     41.71     -9.00
    -10.49     -9.00     39.88

The estimates of the parameters are

$$b_1 = 64.21\,Q_1 - 15.57\,Q_2 - 10.49\,Q_3 = .878$$
$$b_2 = -15.57\,Q_1 + 41.71\,Q_2 - 9.00\,Q_3 = 1.041$$
$$b_3 = -10.49\,Q_1 - 9.00\,Q_2 + 39.88\,Q_3 = .733$$
$$a = \bar{y} - b_1\bar{x}_1 - b_2\bar{x}_2 - b_3\bar{x}_3 = -2.618$$

The formula for the prediction of cranial capacity¹ is

$$C = .00241\, L^{.878} B^{1.041} {H'}^{.733}$$
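The estimates above follow from multiplying the reciprocal matrix into the sums of products. A minimal sketch of that arithmetic is given here; the variable names are hypothetical and the figures are those printed in the text.

```python
# Reciprocal (c-matrix) of the corrected sums of products S_ij, as printed.
c = [[64.21, -15.57, -10.49],
     [-15.57, 41.71, -9.00],
     [-10.49, -9.00, 39.88]]
q = [0.03030, 0.04410, 0.03629]          # Q_1, Q_2, Q_3
y_mean = 3.1685
x_means = [2.2752, 2.1523, 2.1128]

# b_i = sum over j of c_ij * Q_j
b = [sum(c_ij * q_j for c_ij, q_j in zip(row, q)) for row in c]

# Intercept on the log scale, and the multiplier of the power formula.
a = y_mean - sum(b_i * x_i for b_i, x_i in zip(b, x_means))
alpha_prime = 10 ** a

print([round(b_i, 3) for b_i in b], round(a, 3), round(alpha_prime, 5))
```

Since the printed matrix is itself rounded, the third coefficient may differ from the text by one unit in the last place.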

Tests of hypotheses.²


¹The capacities of Farringdon series skulls were determined by tight packing with mustard seed
and weighing in the manner described by Macdonell (1904). The formula is strictly applicable for
predicting capacities determined in this way.

²The general theory of tests of linear hypothesis is discussed by one of the authors in (Rao, 1946).
For exact tests of significance it is necessary that the residuals should be normally distributed. It has
been shown by various writers that the analysis of variance tests hold good provided the departure
from normality is not large.
Having estimated these constants it is relevant to examine how far
the concomitant variables are helpful in prediction. If these variables
are of no use then the prediction formula does not depend on them so
that $\beta_1 = \beta_2 = \beta_3 = 0$. This hypothesis may be tested from the above
data.

The residual sum of squares with (n - 4) d.f. is the minimum value of

$$\sum (y - \alpha - \beta_1 x_1 - \beta_2 x_2 - \beta_3 x_3)^2$$

which is

$$\left(\sum y^2 - n\bar{y}^2\right) - b_1 Q_1 - b_2 Q_2 - b_3 Q_3 \qquad (1)$$
$$= .12692 - .878(.03030) - 1.041(.04410) - .733(.03629)$$
$$= .12692 - .09911 = .02781$$

If the hypothesis $\beta_1 = \beta_2 = \beta_3 = 0$ is true then the minimum value
of $\sum (y - \alpha)^2$ is $\sum y^2 - n\bar{y}^2 = .12692$, which is the total sum of squares
with (n - 1) d.f. The reduction in the sum of squares (1) is due to
regression. The analysis of the sum of squares is shown below.


TABLE 1

Test of the Hypothesis $\beta_1 = \beta_2 = \beta_3 = 0$

               d.f.     S.S.       M.S.         F
Regression       3     .09911    .033037      97.41
Residual        82     .02781    .0003391
Total           85     .12692



The variance ratio 97.41 with 3 and 82 degrees of freedom is significant 
at 1% level which shows that the variables considered above are useful 
in prediction. 
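The entries of this analysis can be reproduced from the estimates and sums of products already given. The sketch below is illustrative, with hypothetical variable names.

```python
b = [0.878, 1.041, 0.733]                # estimated indices b_1, b_2, b_3
q = [0.03030, 0.04410, 0.03629]          # Q_1, Q_2, Q_3
total_ss = 0.12692                       # corrected sum of squares of y
n = 86                                   # number of skulls

# Sum of squares due to regression is sum of b_i * Q_i.
regression_ss = sum(b_i * q_i for b_i, q_i in zip(b, q))
residual_ss = total_ss - regression_ss

regression_df, residual_df = 3, n - 4
f_ratio = (regression_ss / regression_df) / (residual_ss / residual_df)
print(round(regression_ss, 5), round(residual_ss, 5), round(f_ratio, 2))
```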

It may now be examined whether the three linear dimensions appear
to the same degree in the prediction formula. From the estimates it is
seen that the index $\beta_2$ for maximum parietal breadth is higher than the
others. This means that a given ratio of increase in breadth counts more
for capacity than the corresponding increase in length or height.

The hypothesis relevant to examine this point is


$$\beta_1 = \beta_2 = \beta_3 = \beta \text{ (say)}$$
The minimum value of $\sum \{y - \alpha - \beta(x_1 + x_2 + x_3)\}^2$ has to be
found out. The normal equation giving the estimate of $\beta$ is

$$\{S_{11} + S_{22} + S_{33} + 2(S_{12} + S_{23} + S_{31})\}\, b = Q = (Q_1 + Q_2 + Q_3)$$
$$.12485\, b = .11069$$
$$b = .8866$$

The minimum value with n - 2 d.f. is

$$\left(\sum y^2 - n\bar{y}^2\right) - bQ = .12692 - .09814 = .02878$$


TABLE 2

Test of the Hypothesis $\beta_1 = \beta_2 = \beta_3$

                           d.f.     S.S.       M.S.        F
Deviation from equality      2     .00097    .000485     1.430
Residual                    82     .02781    .0003391
Total                       84     .02878

The ratio is not significant so there is no evidence as judged from the
data to conclude that the $\beta$'s are different. The difference, if any, is
likely to be small and a large collection of measurements may be necessary
before anything definite can be said about this. Evolutionists
believe that the breadth is increasing relatively more than any other
magnitude on the skull. If this is true it is of interest to examine how
far the cranial capacity is influenced by the breadth.
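The test of equality of the three indices may be retraced numerically. The following is a minimal sketch with hypothetical names; it fits a single common index to the sum of the three logarithmic measurements and compares the resulting sum of squares with the residual on 82 d.f.

```python
# Corrected sums of products S_ij and Q_i, as printed in the text.
s = [[0.01875, 0.00848, 0.00684],
     [0.00848, 0.02904, 0.00878],
     [0.00684, 0.00878, 0.02886]]
q = [0.03030, 0.04410, 0.03629]
total_ss, residual_ss, residual_df = 0.12692, 0.02781, 82

# With a common index the regressor is x_1 + x_2 + x_3, whose corrected sum of
# squares is the sum of all elements of S and whose product with y is sum of Q.
sum_s = sum(sum(row) for row in s)
sum_q = sum(q)
b_common = sum_q / sum_s

restricted_ss = total_ss - b_common * sum_q     # n - 2 d.f.
deviation_ss = restricted_ss - residual_ss      # 2 d.f.
f_ratio = (deviation_ss / 2) / (residual_ss / residual_df)
print(round(b_common, 4), round(restricted_ss, 5), round(f_ratio, 2))
```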

So far as the problem of prediction is concerned the formula

$$C = .002342\,(L B H')^{.8866}$$

obtained by assuming $\beta_1 = \beta_2 = \beta_3$ may be as useful as the formula
derived without assuming that these are equal. The variance of the
estimate b of $\beta$ is ${\sigma'}^2 / \sum\sum S_{ij}$ where ${\sigma'}^2$ is the estimate based on 84 d.f.
with the corresponding sum of squares given in Table 2.

A simple formula of the type $C = \alpha L B H'$ is sometimes used for
predicting the cranial capacity. A test of the adequacy of such a formula
is equivalent to testing the hypothesis $\beta_1 = \beta_2 = \beta_3 = 1$.
The minimum value of $\sum (y - \alpha - x_1 - x_2 - x_3)^2$ assuming
this to be true is

$$\left(\sum y^2 - n\bar{y}^2\right) + S_{11} + S_{22} + S_{33} + 2S_{12} + 2S_{13} + 2S_{23} - Q_1 - Q_2 - Q_3 = .14108$$

which has (n - 1) degrees of freedom. The residual has (n - 4) degrees
of freedom so that the difference with 3 d.f. is due to deviation from the
hypothesis.


TABLE 3

Test of the Hypothesis $\beta_1 = \beta_2 = \beta_3 = 1$

                                          d.f.     S.S.       M.S.         F
Deviation from β_1 = β_2 = β_3 = 1          3     .11327    .03742      110.3509
Residual                                   82     .02781    .0003391
Total                                      85     .14108



The ratio 110.3509 with 3 and 82 d.f. is significant at 1 % level. This 
shows that the prediction could be bettered by suitably choosing the 
indices. 

In the above table the sum of squares due to deviation from the
hypothesis could be directly calculated from the formula

$$\sum\sum S_{ij}(b_i - 1)(b_j - 1)$$
$$= (b_1 - 1)[S_{11}(b_1 - 1) + S_{12}(b_2 - 1) + S_{13}(b_3 - 1)] + \cdots$$
$$= (b_1 - 1)(Q_1 - S_{11} - S_{12} - S_{13}) + (b_2 - 1)(Q_2 - S_{21} - S_{22} - S_{23}) + (b_3 - 1)(Q_3 - S_{31} - S_{32} - S_{33})$$
$$= b_1 Q_1 + b_2 Q_2 + b_3 Q_3 - Q_1 - Q_2 - Q_3 + \sum\sum S_{ij}$$
$$= .09911 - .11069 + .12485$$
$$= .11327$$

which is the same as that given in Table 3.

Having found that the $\beta$ coefficients individually differ from unity, it
is of some interest to examine whether the indices add up to 3 while
distributing unequally among the three dimensions used. This requires
the test of the hypothesis $\beta_1 + \beta_2 + \beta_3 = 3$. The best estimate of the
deviation is $b_1 + b_2 + b_3 - 3 = 2.652 - 3 = -.348$
with its variance

$$\left(\sum\sum c_{ij}\right)\sigma^2 = 75.68\,\sigma^2$$

The ratio with 1 and 82 d.f. is

$$\frac{(.348)^2}{75.68} \times \frac{1}{.0003391} = 4.72$$

which is significant at 5% level. This shows that the number of dimensions
of the prediction formula is not 3.

The use of the formula for a single skull.

A skull with L = 198.5, B = 147, H' = 131, i.e., $x_1 = 2.298$, $x_2 = 2.167$,
$x_3 = 2.117$, will have the estimated log capacity as

$$\hat{y} = \bar{y} + b_1(x_1 - \bar{x}_1) + b_2(x_2 - \bar{x}_2) + b_3(x_3 - \bar{x}_3) = 3.2069$$

$$C = \text{anti-log } 3.2069 = 1610.$$

$$V(\hat{y}) = \sigma^2\left[\frac{1}{n} + \sum\sum (x_i - \bar{x}_i)(x_j - \bar{x}_j)\,\mathrm{Cov}(b_i, b_j)\right]$$
$$= \sigma^2(.04187)$$

= .0001420 using the estimated value of $\sigma^2$


V(C) = $C^2 V(\hat{y})$ approximately
= 195.2

The covariance of $b_i$, $b_j$ is the element in the i-th row and j-th column of the
matrix reciprocal to $S_{ij}$.
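The prediction for a single skull can be sketched as a small function. The function name is hypothetical, and the logarithms are computed in full rather than rounded to three decimals, so the predicted capacity differs slightly from the figure in the text.

```python
import math

y_mean = 3.1685
x_means = [2.2752, 2.1523, 2.1128]       # means of log L, log B, log H'
b = [0.878, 1.041, 0.733]                # estimated indices

def predict_capacity(length, breadth, height):
    """Return (estimated log10 capacity, estimated capacity) for one skull."""
    x = [math.log10(length), math.log10(breadth), math.log10(height)]
    y_hat = y_mean + sum(b_i * (x_i - m_i) for b_i, x_i, m_i in zip(b, x, x_means))
    return y_hat, 10 ** y_hat

y_hat, capacity = predict_capacity(198.5, 147, 131)
print(round(y_hat, 4), round(capacity))
```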

The use of the formula in estimating the mean capacity. 

The formula can also be used to estimate the mean cranial capacity 
of a series of skulls. For this purpose two methods are available. We 
may estimate the cranial capacity of individual skulls, and calculate the 
mean of these estimates, or we may apply the formula directly to the 
mean values of L, B, H' for the series. It is of interest to know if these
QUERY 61: A problem was recently referred to me for criticism and
it now seems that a further opinion is needed. I would appreciate
your appraisal of the problem. Here is the problem as presented
to me.

In a cattle feeding experiment, twelve rations were tried on 12 lots of
animals in each of two replications. There were two animals per plot.
The twelve rations consisted of all combinations of four winter rations
and three summer rations. Table 1 is the analysis of variance presented
to me.


TABLE 1

Analysis of Variance Summary

Source of Variation       D/F        MS           F
Total                      47
Btw. winter rations         3      9,388.06      1.02
Btw. summer rations         2      3,308.77      2.88
Btw. reps                   1     28,226.97      1.64
W X S                       6      9,528.99      4.85*
W X Reps                    3      1,862.38     24.81*
S X Reps                    2     52,581.43      1.14
W X S X Reps                6     46,198.99      4.00**
Error                      24     11,558.92


My opinion is that the experiment as set up has 24 plots (only 2 reps 
are accounted for in the summary above) and the analysis of variance 
summary should be: 


Total             23
Winter             3
Summer             2
Reps               1
W X S              6
W X R              3
S X R              2
W X S X R          6


It seems to me that if all 48 animals were considered separate plots, 
then there are 4 reps and the summary of analysis of variance would be : 

Total             47
Winter             3
Summer             2
Reps               3
W X S              6
W X R              9
S X R              6
W X S X R         18


Third, if each animal is considered to be a sub-sample of a plot value, 
the analysis could be: 


Total             47
Winter             3
Summer             2
Reps               1
Sub-samples        1
W X S              6
W X Reps           3
W X SS             3

W X S X R X SS     6

From this point of view, the summary originally presented to me was 
satisfactory provided that the difference between sub-samples and all 
interactions in which sub-samples are involved are not significant. From 
a look at the data, this does not seem to be true. Also, the experiment 
apparently does not consider the separate animals as sub-samples. 

I would appreciate your opinion very much. 

ANSWER: None of the suggested analyses appears to be correct. If
there was some real distinction between the replicates, and
if the treatments were distributed at random throughout
each replicate, then the analysis of variance of Table 2 is appropriate.

The experimental error mean square is appropriate for testing each 
of the effects preceding it in the table. None of the mean squares is 
greater than error so that the treatments are clearly without effect in
differentiating the gains in weight. 

It may be well to call attention to the fact that, in the table furnished 
you, most of the values of F are suspect. The sets of hypotheses which 
may be tested, together with corresponding computations and methods 
of using the F-table, have been discussed in this column before (Vol. 1,
page 70; Vol. 2, p. 56). The structure and conduct of this experiment 
TABLE 2

Analysis of Variance of Cattle Feeding Data.
Random Sampling from Two Distinct Replicates.

Source of Variation        Degrees of Freedom     Mean Square
Replications                       1                28,227
Treatments:
  Winter                           3                 9,388
  Summer                           2                 3,309
  Winter X Summer                  6                 9,529
Experimental Error                11                35,268
Animals within lots               24                11,559


indicate the test of the hypothesis that treatments are not effective in 
producing differences among the gains. In making this test on the winter 
rations, for example, the appropriate value of F is 9,388/35,267 = 0.26. 
The F-table shows that, with degrees of freedom 3 and 11, no value of F 
less than 3.59 is significant at the 5% point. In these circumstances, it 
is inappropriate to refer to the tabulated values of F if the sample value 
is less than one; that is, if the treatment mean square is less than that 
for error. 
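The experimental error of Table 2 appears to be the pooled interaction of treatments with replications. The sketch below rests on that reading, which is an inference from the arithmetic rather than a statement in the answer, and uses hypothetical variable names.

```python
# (degrees of freedom, mean square) for the interactions with replications,
# taken from the summary supplied with the query (Table 1).
interactions_with_reps = [(3, 1862.38), (2, 52581.43), (6, 46198.99)]

# Pool by adding sums of squares (d.f. x mean square) and degrees of freedom.
error_df = sum(df for df, _ in interactions_with_reps)
error_ms = sum(df * ms for df, ms in interactions_with_reps) / error_df

treatment_ms = {"winter": 9388.06, "summer": 3308.77, "winter x summer": 9528.99}
f_ratios = {name: ms / error_ms for name, ms in treatment_ms.items()}

print(error_df, round(error_ms))
for name, f in f_ratios.items():
    print(name, round(f, 3))
```

On this reading every treatment ratio falls below one, in agreement with the conclusion that no reference to the F-table is needed.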

From the viewpoint of experimental design, it is interesting to observe
the highly significant (F = 35,268/11,559 = 3.05; F.01 = 3.09)
intraclass correlation among the gains of the two animals per lot. This
lack of independence of the gains may be due to the confinement of each
pair of animals in a common pen. If so, this is a striking illustration of
one danger inherent in the all-too-common practice of housing together
animals receiving the same treatment. Other difficulties were discussed
in query number 60 in the preceding number of this Journal.
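
A minimal sketch of the ratio quoted above, with one addition that is not in the original answer: the usual analysis-of-variance estimate of the intraclass correlation, (MS_between - MS_within) / (MS_between + (k - 1) MS_within), for k animals per lot. The choice of that estimator and the function name are assumptions made for illustration.

```python
def intraclass_correlation(ms_between, ms_within, k):
    """Estimate intraclass correlation from two mean squares (k units per group).

    Assumed estimator, not stated in the text:
    (MSB - MSW) / (MSB + (k - 1) * MSW).
    """
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

ms_lots = 35268      # experimental error (between lots), Table 2
ms_animals = 11559   # animals within lots, Table 2
animals_per_lot = 2

f_lots = ms_lots / ms_animals
r_intra = intraclass_correlation(ms_lots, ms_animals, animals_per_lot)

print(f"F = {f_lots:.2f} on 11 and 24 d.f.")
print(f"estimated intraclass correlation = {r_intra:.2f}")
```

Under that assumed estimator the two pen-mates share roughly half of the variation in gain, which is in line with the warning about housing treated animals together.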

Daniel G. Horvitz 


QUERY 62: Forty Hereford heifers were divided equally into four
separate lots of ten each, and fed the drugs under study by mixing
these substances in the grain ration. Each lot of cattle was self-fed
the grain ration. Breeding efficiency and fertility were studied
by exposing the heifers in each lot to each of two bulls (5 heifers
bred to each bull). The breeding program was initiated 149 days after
the start of the drug feeding period and continued for 13 weeks, at which
time the heifers were slaughtered.

It should be pointed out that in Lot II, two heifers were found to 
have abnormal reproductive organs and were diagnosed as freemartins. 
These heifers were never in heat and therefore did not influence the 
number of services per conception. With these facts in mind, perhaps 
these two animals should not be included in the data and this lot (Lot II) 
be considered as having 8 heifers instead of 10. 

I am wondering if there is some means of analyzing these data in
order to see if there is a significant difference between the numbers of
services per conception for the various lots.

Table 1 gives the information on breeding of each animal and also 
for each lot. 


TABLE 1

Data on Breeding in Four Lots of Heifers

                                  Lot 1      Lot 2       Lot 3        Lot 4
                                  Control    Arsenic     Nux-vomica   Thiouracil
                                             Trioxide

Pregnancies after 1 service          8          5           8            7
Pregnancies after 2 services         1          0           0            3
Pregnancies after 3 services         0          1           1            0
Not pregnant: 1 service              1          1           0            0
              2 services             0          1           1            0
Freemartins                          0          2           0            0

Total                               10         10          10           10


ANSWER: The answer to your question is clear, but merely to answer
it would, I fear, be a disservice because the answer is
apparently not pertinent to your problem. The answer:
In each lot you have a frequency distribution showing the numbers of
pregnancies following 1, 2 and 3 services. Following the customary
methods of computation in frequency distributions, you can get the
mean and the variance for each lot. These can then be combined into
the usual analysis of variance for groups. The method is outlined in
examples 8.8 and 10.16 of the 4th Edition of my text, and is used in
query number 61, which precedes this one. I carried through the test
and got F = 0.24.

To see why I think this would be irrelevant, consider two lots, in one
of which there occurred only a single pregnancy, this following a single 
service; while in the other lot 10 pregnancies resulted from 10 services. 
In each lot there is one service per pregnancy, but I question whether 
you would consider the breeding efficiency or the fertility the same. 
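
The group analysis of variance mentioned in the answer can be sketched from the frequency distributions of Table 1. This is a hedged reconstruction, not the author's worksheet: it assumes that only heifers that became pregnant enter the analysis (non-pregnant animals and freemartins are left out), and the function names are hypothetical.

```python
# Services per pregnancy -> number of heifers, from Table 1.
# Assumption: only pregnant heifers are counted.
lots = {
    "Control": {1: 8, 2: 1, 3: 0},
    "Arsenic trioxide": {1: 5, 2: 0, 3: 1},
    "Nux-vomica": {1: 8, 2: 0, 3: 1},
    "Thiouracil": {1: 7, 2: 3, 3: 0},
}

def summarize(freq):
    """Return (n, sum of x, sum of x squared) for a frequency distribution."""
    n = sum(freq.values())
    total = sum(x * f for x, f in freq.items())
    sum_sq = sum(x * x * f for x, f in freq.items())
    return n, total, sum_sq

def one_way_f(groups):
    """One-way analysis of variance F from per-group frequency distributions."""
    stats = [summarize(freq) for freq in groups.values()]
    n_all = sum(n for n, _, _ in stats)
    grand_total = sum(t for _, t, _ in stats)
    correction = grand_total ** 2 / n_all
    between_ss = sum(t * t / n for n, t, _ in stats) - correction
    within_ss = sum(sq - t * t / n for n, t, sq in stats)
    df_between = len(stats) - 1
    df_within = n_all - len(stats)
    f = (between_ss / df_between) / (within_ss / df_within)
    return f, df_between, df_within

f_value, df_between, df_within = one_way_f(lots)
print(f"F = {f_value:.2f} on {df_between} and {df_within} d.f.")
```

Under that assumption the ratio agrees with the F = 0.24 reported in the answer.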

I have assumed that you used the words “pregnancy” and “conception”
synonymously, yet in each of three lots there were heifers not
pregnant after one or two services. Presumably they were not serviced 
again because they were not in heat. Can it be that conception took 
place followed by abortion? Can this type of historical data be learned 
from the postmortem? 

As for freemartins, I agree that the postmortem dictates their
elimination from the experiment. In fact, I am surprised that you included
them originally.

It is not clear what effect of treatment you wish to evaluate. It 
might be the number of pregnancies following the first service (or the 
second, or the third, or some combination of them), it might be the 
number of sterile heifers or the number of abortions, or it might be some 
combination of all these. It might even be the number of services per 
pregnancy, especially if this were considered in the light of other 
information. 

It seems to me that any result you get may be ambiguous. How can 
you distinguish between the effects of treatments and the effects of bulls? 
The description of this part of the experiment is not clear to me. 

From the foregoing you will see that any exact evaluation of your 
results is difficult if not impossible. However, I am willing to hazard a 
guess! Considering the small size of your samples, the uniformity of
your results appeals to me as notable. I guess that the treatments are 
not significantly different. 

George W. Snedecor 


QUERY 63: Recently, we received data from a field veterinarian
relative to the healing qualities of two drugs. In the data shown
in Table I, how would you estimate whether or not there is a
statistically significant difference between the healing qualities of the
two drugs?



272 


BIOMETRICS, DECEMBER 1948 


TABLE I

                 No. of Animals
Healing        Drug A      Drug B

Poor              1           1
Fair              6           4
Good             16           6
Excellent         7           5

Total            30          16


ANSWER: Assuming random sampling from normal distributions,
the t-test is applicable. If it is further assumed that the
degrees of healing can be measured by equally spaced
numbers, it is convenient to assign the integers, 1-2-3-4, to the four
categories, poor-fair-good-excellent. The means for drugs A and B
are then calculated from the two frequency distributions. They turn
out to be 2.97 and 2.94, so nearly the same that no test of significance
is required. But for illustration, the two sums of squares of deviations
from means are computed, 16.67 and 12.82, the sum being 29.49. The
formula for t is, for this group comparison,

$$ t = (\bar{x}_1 - \bar{x}_2) \left[ \frac{n_1 n_2 (n_1 + n_2 - 2)}{(n_1 + n_2)\, \Sigma x^2} \right]^{1/2}, $$

where $n_1$ and $n_2$ are the two sample sizes and $\Sigma x^2$ is the pooled sum of
squares.

Substituting:

$$ t = (2.97 - 2.94) \left[ \frac{(16)(30)(44)}{(46)(29.49)} \right]^{1/2} = 0.12, $$

with 44 degrees of freedom.
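
The substitution above can be checked with a short sketch. The scores 1 to 4 follow the answer; the function names are hypothetical. The means are recomputed from Table I, while t is evaluated from the printed summary figures (2.97, 2.94 and 29.49), since sums of squares recomputed from the table as transcribed here need not match the printed 16.67 and 12.82 exactly.

```python
import math

def summarize(freq):
    """Return (n, mean, sum of squared deviations) for scored frequencies."""
    n = sum(freq.values())
    total = sum(score * f for score, f in freq.items())
    sum_sq = sum(score * score * f for score, f in freq.items())
    return n, total / n, sum_sq - total * total / n

def group_t(mean_1, mean_2, n_1, n_2, pooled_ss):
    """t for a two-group comparison from means, sizes and pooled sum of squares."""
    df = n_1 + n_2 - 2
    t = (mean_1 - mean_2) * math.sqrt(n_1 * n_2 * df / ((n_1 + n_2) * pooled_ss))
    return t, df

# Scores: 1 = poor, 2 = fair, 3 = good, 4 = excellent (Table I).
drug_a = {1: 1, 2: 6, 3: 16, 4: 7}
drug_b = {1: 1, 2: 4, 3: 6, 4: 5}

n_a, mean_a, ss_a = summarize(drug_a)
n_b, mean_b, ss_b = summarize(drug_b)
print(f"Drug A: n = {n_a}, mean = {mean_a:.2f}")
print(f"Drug B: n = {n_b}, mean = {mean_b:.2f}")

# Use the summary figures printed in the answer to reproduce its t.
t_printed, df = group_t(2.97, 2.94, n_a, n_b, 29.49)
print(f"t = {t_printed:.2f} with {df} degrees of freedom")
```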


George W. Snedecor 








THE BIOMETRIC SOCIETY 


One year after its formation at the First International Biometric 
Conference in Woods Hole, the Biometric Society has a total membership 
of 673 (as of October 22). Most of these are affiliated with an organized 
region, 362 with the Eastern North American Region (ENAR), 103 
with the British Region and 55 with the Western North American
Region. The organization of an Australian Region will be completed next
January, if not before, and so far has enrolled 23 members. The other 
130 members live in areas which have not yet been organized or are 
members-at-large. We hope that some of them will be provided with a 
home before many months have passed. Both a Western European and 
an Indian Region are about to be formed. 

Members of the Society and their colleagues will want to start planning
for the Second International Biometric Conference. This is planned
for Geneva, Switzerland, late next summer, probably in the period from 
August 30 to September 2. The University of Geneva will be our host 
and Professor Arthur Linder assures us that accommodations will be 
available for every income level. A special organizing committee for 
the conference will be formed in the near future. Travel agencies urge 
that you reserve steamship accommodations at once if there is any 
chance of your being able to attend. Steamship reservations for early 
next summer are scarce. Making a reservation now will not obligate 
you in any way and later it may be extremely difficult or impossible to 
obtain the type of accommodation you want. 

Our conference precedes the next session of the International 
Statistical Institute, which will be held by invitation of the Swiss
government at Berne or Luzerne from September 4 to 10. By invitation, the
Biometric Society has applied for affiliation with the International
Statistical Institute. This will facilitate the coordination of our
programs and minimize the risk of overlooking any vital aspect of our field.
Biometry was accepted provisionally at a meeting in UNESCO House 
in Paris on October 4 and 5 as a section of the International Union of 
the Biological Sciences. In view of this action the road is now open for 
UNESCO support for our conference in Geneva. 

In carrying out the requirements of the constitution, the Council 
has made several decisions as to procedure. To insure their ready 
availability, these rules have been formulated in a series of “Council 
By-Laws” which are now in process of final revision and adoption. 
They concern finances, the relation of the Regions to the Society, regional
officers and dues, nominations for Council, Council elections and
international conferences. It is hoped in this way to provide a permanent
record for the guidance of all officers of the Society rather than to leave 
these details to chance. 

The application of the Eastern North American Region (ENAR) for 
membership in the Division of Biology and Agriculture of the National 
Research Council has been accepted. William G. Cochran has been
named to represent the Society in the Division. 

In addition to the regional meeting of ENAR with the American 
Public Health Association in Boston in November, another joint session 
was arranged with the American Association of Economic Entomologists 
for the evening of December 13 at their annual meeting in New York. 
The session consisted of an informal biometric “clinic” on entomological 
problems submitted by the entomologists. A panel of statistical and
biometrical “experts” had the job of answering these questions to the 
satisfaction of the entomologists. The most extended series of programs 
of ENAR were those arranged jointly with the Biometrics Section of the 
American Statistical Association at the annual meeting in Cleveland in 
late December. They will be reported in the next issue of Biometrics.

The By-Laws adopted by the British Region on March 31 and since 
approved by the Council are as follows: 

“As a division of the Biometric Society, the British Region is 
governed by the Constitution of the Society and the following 
Regional Rules. 

1. The region shall endeavour to promote quantitative biology 
in all its aspects. 

2. Membership of the Region is open to residents in the British 
Isles, and to British scientists resident in other countries. 

3. The business of the Region shall be conducted by a Committee
consisting of the Vice-President for the Region, the Secretary,
Treasurer, and six Ordinary Members together with any Members
of the General Council of the Society who belong to the Region.
Four shall form a quorum.

4. The Vice-President, Secretary and Treasurer shall retire
annually but shall be eligible for re-election. Two ordinary members
shall retire each year by seniority in order of election and shall not
be eligible for re-election to ordinary membership of the Committee
until a year has elapsed. The Committee shall have power to fill
casual vacancies in their number, subject to the approval of the
next Annual Meeting. Any member so appointed to a casual vacancy
shall hold office only for the unexpired term of his predecessor;
but shall be eligible for immediate re-election.

5. There shall be an Annual Meeting of the Region during the
months of March and April, and such other meetings as the Committee
may decide. At least ten days notice shall be given of all
meetings, other than the Annual Meeting or Special Meetings.

6. At least six weeks notice shall be given of the Annual Meeting.
At the same time there shall be sent to each member a list of
the Regional Officers and Members of Committee indicating those
due to retire, and inviting nominations to fill the vacancies.
Nominations must be received by the Secretary at least four weeks before
the date of the Annual meeting, and must be signed by at least two 
members of the Region and must be accompanied by a declaration, 
signed by the nominee, that he is willing to serve if elected. 

7. No ballot shall be taken if the nominations are insufficient 
or just sufficient to fill the vacancies, and in the former case the 
Committee shall make such additional nominations as are required 
to fill the remaining vacancies. All the persons so nominated shall 
be deemed elected, pending action by the Council. 

8. If there should be more than one nomination for any office, 
or more nominations for ordinary membership of the Committee 
than there are vacancies, a postal ballot shall be held. At least 
two weeks before the Annual Meeting each member shall be sent a 
ballot paper containing a list of those vacancies for which a ballot 
is to be held, together with the names of the persons validly
nominated to fill them. The ballot papers shall be returned not later
than the commencement of the Annual Meeting, at which a count 
shall be made and the result declared. 

9. A Special Meeting shall be convened within eight weeks of the 
receipt by the Secretary of a request signed by not less than twelve 
members. A notice stating the purpose of the meeting shall be sent 
to each member not less than two weeks before it is to be held. 

10. The Committee shall receive nominations of candidates for 
membership of the region, and shall forward those deemed
appropriate to the General Council.

11. A member may be expelled from the Region only on the 
proposal of the Committee and by a majority of two-thirds of those 
present and voting at an Annual Meeting, after at least two weeks 
notice has been given. 

12. The subscription of each member shall be £1, which becomes
due on election to the Region, and subsequently on 1st February
each year. From the fund so constituted the contributions to the 
general funds of the Society shall be paid and secretarial expenses 
and other such costs as the proper conduct of the Society demands 
may be defrayed. 

13. The membership of any member who is three or more years 
in arrears with his subscription may be terminated on a vote of the 
Committee. No application to rejoin the Society shall be
entertained until the unpaid subscriptions have been discharged.

14. A statement of the Region’s finances shall be presented to 
the Annual Meeting by the Treasurer, after the accounts have been 
audited by an Hon. Auditor appointed at the previous Annual 
Meeting for the purpose. The Hon. Auditor shall not be an 
Officer or Member of the Committee. 

15. These rules may be amended at an Annual Meeting or a
Special Meeting convened for the purpose, after six weeks notice has
been given, by a majority of two-thirds of those voting, a postal 
ballot being taken of the same kind as for the election of Officers.” 


It is with great sorrow that we report the loss of one of our most 
active and able members. Professor John H. Watkins of Yale
University died suddenly on September 25. As Secretary-Treasurer of
ENAR and at the same time of the Biometrics Section of the American 
Statistical Association, he has done invaluable service in coordinating 
the activities of these two organizations. His work on hospitalization 
statistics for the Army in World War II and on hospital and public 
health statistics in New Haven before and after the war was outstanding. 
He will be sorely missed by his many friends and colleagues. 



NEWS AND NOTES 

CANADA —G. C. Ashton, Assistant Professor of Nutrition, MacDonald
College, Quebec, has made a few comments regarding the value
of statistical methods for research in Nutrition. “Modern statistics
provide a means of condensing masses of nutrition data which allows
interpretation of same which would otherwise be impossible. One of the
most characteristic attributes of biological organisms is their variability.
Nutrition research is no less affected by this quality than other branches
of biology and one of statistics’ great aids in this field of research has
been to provide an effective means of measuring this variability.
Statistical development has aided in indicating the most suitable
experimental designs with which to resolve nutritional problems. With these
designs, normal interactions can be allowed to occur and their effect
and extent determined. Design research has given set-ups which indicate
the most effective use of the experimental animals thereby cutting costs
of research in terms of time involved, feed required and monetary
outlay, while at the same time increasing the precision of the estimates.”
Much of Mr. Ashton’s time is taken up with guiding graduate students
in the application of statistics to their problems.

ENGLAND —N. T. Gridgeman, Eastham, Cheshire, writes, “Biometrics
is good. I’m shocked to read that my friend D. J. Finney deprecates
what he calls the ‘social gossip’. News and Notes is an admirable
feature; one’s natural curiosity about fellow scientists in other countries
is all too seldom met. Until a man dies or unless he does something
notable enough to engage the attention of the lay press, nothing but his
name and address are available. This is as surely wrong as your reparation
is surely right.” This is an encouraging note for those who assemble
this news about Biometrics subscribers. One more objection has been
received from a person who is concerned by the undignified “News and
Notes”. Three down, but still we go... Major I. B. Perrott, Solihull,
Birmingham, is to serve as “News Editor” for the British Region of
The Biometric Society. Send your news to him. He has taken up recently
a position in pure mathematics in the Department of Mathematics,
University of Leeds.

INDIA —Raj Chandra Bose has resigned as head of the graduate
Department of Statistics, University of Calcutta. He has been appointed
Professor of Mathematical Statistics, University of North Carolina,
beginning in the Winter of 1949. Professor Bose is an authority on the
design of experiments and is writing a book on the combinatorial
mathematics of the subject. He served as Visiting Professor at Columbia
University during the Fall of 1947 and was at the Institute of Statistics
during the Winter and Spring of 1948... C. Chandra Sekar, Professor
of Statistics at the All-India Institute of Hygiene and Public Health at
Calcutta, who was a student at The Johns Hopkins School of Hygiene
and Public Health last year, is remaining in this country for a second
year as a member of the Population Division of the United Nations.

UNITED STATES —Isidore Altman, Biostatistician, Public Health 
Service, Washington, D. C., is directing his efforts toward problems in 
medical economics, particularly the collection of information on the 
number and distribution of medical personnel and facilities and on the 
cost of medical care. He writes, “Two illustrative studies are (a) an 
analysis of the supply of physicians in the District of Columbia and of 
their patient load, and (b) a study of the medical care sought by older 
persons in the Eastern Health District of Baltimore, with particular 
interest in chronic disease and its social and economic aspects.” ... 
Huldah Bancroft, The School of Medicine, Tulane University of Louisiana,
New Orleans, has charge of teaching Biostatistics for the Department
of Tropical Medicine and Public Health. She teaches an undergraduate
course in biostatistics which is required of all sophomores in
the School of Medicine. Also, a course is being given for faculty
members. Miss Bancroft serves as a consultant for the school. She has as
her assistant Margaret Allen who was formerly at Harvard with E. B. 
Wilson and, during the war, with the Navy... Edward W. Barankin
was promoted to Assistant Professor and Research Associate at the
Statistical Laboratory, University of California, Berkeley... Geoffrey
Beall moves from paper to glass. He was statistician at the Institute 
of Paper Chemistry, Appleton, Wisconsin. He now has a similar position 
at the Preston Laboratories, which are concerned with research on glass. 
The Preston Laboratories are located at Butler, Pennsylvania... Bernice 
Brown, Project Rand, Santa Monica, California, does not like a statement
that several biological statisticians have gone astray into industry.
She responds, “I object to the word ‘astray’. There are gadgets in
industry which behave very much like pigs and rats. The opportunity 
for learning more about statistics exists here under conditions not unlike 
those of an experiment station.” Is she sold on California! Is it the 
climate?... W. V. Charter, Deputy Director, Medical Statistics Division,
Bureau of Medicine and Surgery, Navy Department, responded to an
inquiry regarding the work of his division thus, “We have approximately 
75 people in this Division of whom 7 are professional statisticians 
comprising what we know as a Staff. The other personnel are divided 
into administration, editing and coding, tabulating, and machine operations
people. Since this is the focal point of all medical statistics in the
Navy, we are responsible for format, organization, instructions, etc.,
concerning casualties, mortalities, and all other morbid conditions. We 
receive individual patient reports on every man who turns in, in the 
Navy, whether it be here in Washington, San Francisco or Shanghai!” 
He tells of a vast amount of other medical statistics data which they 
receive. The Division publishes a monthly magazine “Statistics of Navy
Medicine”. The military services have compulsory reporting as well as
having a direct knowledge at all times of their current population...
W. G. Cochran of the Institute of Statistics, University of North
Carolina, has accepted an appointment as Professor of Biostatistics at the
School of Hygiene and Public Health of The Johns Hopkins University.
He will take up his new post the first of the year... James F.
Crow formerly at Dartmouth College, Hanover, New Hampshire, is 
now with the Department of Genetics, The University of Wisconsin, 
Madison. He is teaching courses in genetics and doing Drosophila research...
W. Edwards Deming, Division of Statistics, Department of
the Budget, Washington, and his wife have returned from a two months’ 
sojourn to various parts of Europe. Mr. Deming attended the meeting 
of the United Nations Sub Commission on Statistical Sampling in 
Geneva, and held consultations on sampling and the control of quality 
in Rome, Milan, Paris, Amsterdam, Leiden, and Den Haag. He reports
that there is much interest and progress in statistical methodology in
all these places in government agencies, national standardizing bodies, 
manufacturing industries, public opinion and market research
organizations... Max Halperin, graduate student with the Department of
Mathematical Statistics, University of North Carolina, Chapel Hill, 
has joined the Project Rand group at Santa Monica, California... 
William F. Hewitt, Jr., was a physiologist-pharmacologist and literature 
scientist in the Literature Research Department, Smith, Kline and 
French Laboratories, Philadelphia. He is now an Assistant Professor of 
Physiology, School of Medicine, Howard University, Washington, D. C. 
He states, “In addition to teaching and conducting laboratory research, 
I am trying to establish a literature-science unit, one of the functions of 
which will be to instruct and consult in experimental design and judgment
of evidence.” Mr. Hewitt will welcome suggestions as to possible
activities of an academic group of this type... Joseph L. Hodges, Jr.,
has been promoted to Instructor and Research Associate at the Statistical
Laboratory, University of California, Berkeley... Carol M.
Jeager is doing statistical analysis work with the Bureau of Ships, 
Navy Department, Washington. Miss Jeager was formerly with the 
Department of Agriculture’s Northern Regional Research Laboratory, 
Peoria, Illinois... E. L. Leclerg has left the Bureau of the Budget to 
join the Agricultural Research Administration in Washington as Research
Coordinator responsible for the field of crop production...
Douglas E. Scates has taken a new post with The American Council on 
Education, in charge of their research in scientific personnel for contracts 
sponsored by the Office of Naval Research. He was with the Department 
of Education, Duke University, Durham, North Carolina... D. M. 
Seath recently left the Louisiana Agricultural Experiment Station, Baton 
Rouge. He is now a Professor of Dairy Husbandry, University of 
Kentucky, Lexington, and is in charge of the Dairy Section... Arthur 
G. Steinberg who was with the Fels Research Institute, Antioch College,
Yellow Springs, Ohio, is now a member of the staff of the Division of 
Biometry and Medical Statistics, Mayo Clinic, Rochester, Minnesota... 
B. L. Wade is now head of the Department of Horticulture, University 
of Illinois, Urbana. He expects to put considerable emphasis on the 
development of graduate work in horticulture. Mr. Wade served for 
several years as Director of the Regional Horticulture Laboratory, 
Charleston, South Carolina. To him is due considerable credit for 
promoting cooperative research in the Southeast... J. Yerushalmy is 
now Professor of Biostatistics, School of Public Health, University of 
California, Berkeley. He is continuing with the studies on statistical 
problems in assessing Methods of Medical Diagnosis. Mr. Yerushalmy 
was formerly with the Tuberculosis Control Division, U. S. Public 
Health Service, Bethesda, Maryland. There is some comfort to know 
that others have the problem of staffing their department and planning 
the teaching and research program.

Populations, 21
Prediction, 247, 249
Probit, 197
Pseudovariate, 241
Pure variance, see variance, pure
Quantal response, 198
Quasifactorial design, 59
Radiobiology, 139
Randomized blocks, 74, 103, 234, 244
Recombination, 1, 7
Rectangular lattice, 137
Regression, 22, 50, 158, 182, 247, 255
Replication, fractional, 59
Sampling, 223, 234
    systematic, 223
Sampling variance, 226, 235
Scedasticity, 25
Serology, 141, 143
Soil heterogeneity, 113
Source of variation, 101, 199
Split plots, 118
Statistical control, 161
Statistical model, 168
Statistics, 277
Statistics courses, 87
Teaching statistics, 87
Test of linearity, 104
Test of parallelism, 104
Test of significance, 214, 244, 249, 263
Transformations, 211
T test, 272
Variance, 30, 244
    complex, 135
    components, 227, 235, 254, 256, 260
    homogeneity of, 125
    mean, 71, 110, 115, 118
    pure, 235
    sampling, 227, 235
Weights, 119
Wound healing, 51





